Optimize Your System z ROI with z Operational Insights (zOI)

Hopefully all System z users are aware of the Monthly License Charge (MLC) pricing mechanisms, where a recurring charge applies each month.  This charge includes product usage rights and IBM product support.  If only it were that simple!  We then encounter the “Alphabet Soup” of acronyms, related to the various and arguably too numerous MLC pricing mechanism options.  Some might say that 13 is an unlucky number, and in this case a System z pricing specialist would need to know and understand each of the 13 pricing mechanisms in depth, to safeguard the lowest software pricing for their organization!  Perhaps we could apply the unlucky word to such a resource.  In roughly alphabetical order, the 13 MLC pricing options are AWLC, AEWLC, CMLC, EWLC, MWLC, MzNALC, PSLC, SALC, S/390 Usage Pricing, ULC, WLC, zELC and zNALC!  These mechanisms are commercial considerations, but what about the technical perspective?

Of course, System z Mainframe CPU resource usage is measured in Millions of Service Units (MSU).  Sub-Capacity pricing allows System z Mainframe users to submit Sub-Capacity Reporting Tool (SCRT) reports, covering Monthly License Charges (MLC) and IPLA software maintenance, namely Subscription and Support (S&S).  We must then consider the Rolling 4-Hour Average (R4HA) and how best to optimize MSU usage accordingly.  At this juncture, we need to consider how we measure the R4HA itself, in terms of performance tuning, so we can minimize R4HA MSU usage and optimize cost, without impacting Production workloads or overall system performance.
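
The R4HA arithmetic can be sketched in a few lines.  This is a minimal illustration, not how SCRT itself processes SMF data: it assumes evenly spaced 5-minute MSU samples (so 48 samples span four hours), the function name rolling_4hr_average is hypothetical, and the MSU values are invented for the example.

```python
from collections import deque


def rolling_4hr_average(msu_samples, samples_per_window=48):
    """Return the rolling average MSU after each sample.

    Assumes evenly spaced samples; 48 five-minute samples span four hours.
    Until the window fills, the average covers only the samples seen so far.
    """
    window = deque(maxlen=samples_per_window)
    total = 0.0
    averages = []
    for msu in msu_samples:
        if len(window) == window.maxlen:
            total -= window[0]  # oldest sample is about to drop out of the window
        window.append(msu)
        total += msu
        averages.append(total / len(window))
    return averages


# Illustrative data only: four hours at 100 MSU, then a one-hour spike to 400 MSU.
samples = [100] * 48 + [400] * 12
r4ha = rolling_4hr_average(samples)
print(max(samples), r4ha[-1])  # instantaneous peak versus R4HA at the end of the spike
```

With these invented numbers, the one-hour spike to 400 MSU lifts the R4HA to only 175 MSU, which is why shifting or smoothing short peaks can matter more for cost than the instantaneous maximum.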

Finally, we have to consider that WLC has a ~17-year longevity, having been announced in October 2000, and in that time IBM have also introduced hardware features to assist in MSU optimization.  These hardware features include zIIP, zAAP and IFL, while there are other influencing factors, such as HiperDispatch, WLM and Relative Nest Intensity (RNI), to name but a few!  The Alphabet Soup continues…

In summary, since the introduction of WLC in Q4 2000, the challenge for the System z user is significant.  They must collect the requisite instrumentation data, perform predictive modelling and fully comprehend the impact of the current 13 MLC pricing mechanisms and their interaction with the ever-evolving System z CPU chip!  In the absence of a simple-to-use reporting capability from IBM, there is a plethora of 3rd party ISV solutions, which are generally overly complex and require numerous products, more often than not from several ISVs.  These software solutions process the instrumentation data, generating the requisite metrics that allow an informed decision-making process.

Bottom Line: This is way too complex, so are there any Green Shoots of an alternative option?  Are there any easy-to-use, data analytics based options for reducing MSU usage and optimizing CPU resources, which can then be incorporated into any WLC/MLC pricing considerations?

In February 2016 IBM launched their z Operational Insights (zOI) offering, as a new open beta cloud-based service that analyses your System z monitoring data.  The zOI objective is to simplify the identification of System z inefficiencies, while identifying savings options with associated implementation recommendations. At this juncture, zOI still has a free edition available, but as of September 2016, it also has a full paid version with additional functionality.

Currently zOI is limited to the CICS subsystem, incorporating the following functions:

  • CICS Abend Analysis Report: Highlights the top 10 types of abend and the top 10 most frequently abending transactions for your CICS workload. The resulting output classifies which CICS transactions might abend and, as a consequence, waste processor time.  Of course, the System z Mainframe user will still have to fix the underlying reason for the CICS abend!
  • CICS Java Offload Report: Highlights any transaction processing workload eligible for IBM z Systems Integrated Information Processor (zIIP) offload. The resulting output delivers three categories for consideration.  #1; % of existing workload that is eligible for offload, but ran on a General Purpose CP.  #2; % of workload being offloaded to zIIP.  #3; % of workload that cannot be transferred to a zIIP.
  • CICS Threadsafe Report: Highlights threadsafe eligible CICS transactions, calculating the switch count from the CICS Quasi Reentrant Task Control Block (QR TCB) per transaction and associated CPU cost. The resulting output identifies potential CPU savings by making programs threadsafe, with the associated CICS subsystem changes.
  • CICS Region CPU Constraint: Highlights CPU constrained regions. CPU constrained CICS regions have reduced performance, lower throughput and slower transaction response, impacting business performance (E.g. SLA, KPI).  From a high-level viewpoint, the resulting output classifies CICS regions to identify whether they’re LPAR or QR constrained, while suggesting possible remedial actions.
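
The three categories in the CICS Java Offload Report amount to a simple split of CPU time.  The sketch below is only an illustration of that split, not zOI's actual calculation: the function name offload_breakdown, its parameters and the CPU-second figures are all hypothetical.

```python
def offload_breakdown(eligible_on_cp, on_ziip, not_eligible):
    """Split CPU time (any consistent unit) into the three report categories.

    eligible_on_cp: zIIP-eligible work that actually ran on a General Purpose CP
    on_ziip:        work that was offloaded to a zIIP
    not_eligible:   work that cannot be transferred to a zIIP
    """
    total = eligible_on_cp + on_ziip + not_eligible
    if total <= 0:
        raise ValueError("total CPU time must be positive")
    return {
        "eligible_but_on_cp_pct": 100.0 * eligible_on_cp / total,
        "offloaded_to_ziip_pct": 100.0 * on_ziip / total,
        "not_eligible_pct": 100.0 * not_eligible / total,
    }


# Invented figures: 30s eligible but on a CP, 50s on zIIP, 120s not eligible.
breakdown = offload_breakdown(30.0, 50.0, 120.0)
for category, pct in breakdown.items():
    print(f"{category}: {pct:.1f}%")
```

The first category is the interesting one from a cost viewpoint, since it represents General Purpose CP time that could in principle move to a zIIP.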

Clearly the potential of zOI is encouraging, being an easy-to-use solution that analyses instrumentation data, classifies the best options on a quick-win basis and provides recommendations for implementation.  Having been a recent user of this new technology myself, I would encourage each and every System z Mainframe user to try this no-risk IBM z Operational Insights (zOI) software offering.

The evolution for all System z performance analysis software solutions is to build on the comprehensive analysis solutions that have evolved in the last 20+ years, while incorporating intelligent analytics to classify data in terms of “Biggest Impact” and identify “Potential Savings”, evolving from MIPS measurement to BIPS (Biggest Impact Potential Savings) improvements!

IBM have also introduced a framework of IT Operations Analytics Solutions for z Systems.  This suite of interconnected products includes zOI, IBM Operations Analytics for z Systems, IBM Common Data Provider for z/OS and IBM Advanced Workload Analysis Reporter (IBM zAware).  Of course, if we lived in a perfect world, without a ~20 year MLC and WLC longevity, this might be the foundation for all of our System z CPU resource usage analysis.  Clearly this is not the case for the majority of System z Mainframe customers, but zOI does offer something different, with zero impact from both a system and an existing software interoperability viewpoint.

Bottom Line: Optimize Your System z ROI via zOI, Evolving From MIPS Measurement to BIPS Improvements!

21st Century Mainframe Capacity Planning Requirements

With nearly 5 decades of longevity the IBM Mainframe has changed beyond recognition in terms of CPU capacity and performance capability.  The Capacity Planning discipline for the IBM Mainframe server became more advanced and proactive in the early 1990’s, perhaps coinciding with the introduction of Parallel Sysplex structures associated with the MVS/ESA operating system.  Therefore the requirement to measure and model the impact of workload movement between LPAR and CPC structures became important, if not mandatory.

The fundamental building-block for Mainframe CPU usage analysis is SMF Type 7n records (I.E. RMF or CMF), where this data was typically processed by MXG, MICS and maybe CIMS (acquired by IBM), generally using SAS for reporting purposes.  Other tools, including but not limited to, BEST/1 (acquired by BMC) and PERFMAN (acquired by ASG) also offered capacity planning and performance management solutions.  Therefore, for 20+ years the fundamental Mainframe CPU usage data and associated tools have remained largely the same.  However, maybe the IBM Mainframe server has changed, both in terms of underlying CPU chip technology and customer workload deployment…

I often hear capacity planners state something along the lines of “I can report on the past with 100% accuracy, but predicting the future might prove to be a little more difficult”!  Once again, going back to the early 1990’s, the IBM Mainframe had a typical if not generic workload profile deployment, namely On-Line Transaction Processing (E.g. CICS, IMS DC) and related Database Management Subsystems (E.g. DB2, IMS DB) with Batch Processing.  This somewhat limited workload profile simplified the Capacity Planning process, applying estimates of growth based on current usage.  However, when the Mainframe became more pervasive, taking on new workloads, how was the capacity planner supposed to estimate CPU requirements for their new business application workload?

IBM introduced the Large Systems Performance Reference (LSPR) methodology, designed to provide relative processor capacity data for IBM System/370, System/390 and z/Architecture processors.  All LSPR data is based on a set of measured benchmarks and analysis, covering a variety of System Control Program (SCP) and workload environments.  LSPR data is intended to be used to estimate the capacity expectation for a production workload when considering a move to a new processor.  Although LSPR data is provided on an “as is” basis, with no warranty, it at least provides the Mainframe Capacity Planner with some insight into their CPU sizing challenge.  For many years, LSPR provided the only other data source, as well as RMF (CMF) for Mainframe CPU sizing.  However, is there a more accurate data source, perhaps based on real-life customer data?
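
The basic idea of using relative capacity data for a processor move can be sketched as a simple ratio.  This is a deliberately naive illustration, assuming a single relative capacity rating per processor; the function name projected_utilization and the ratings used are hypothetical, and real sizing would use the workload-specific LSPR tables rather than one number.

```python
def projected_utilization(current_util_pct, current_capacity, target_capacity):
    """Scale a measured utilization by the ratio of two relative capacity ratings.

    The ratings are unitless relative values; only their ratio matters here.
    """
    if current_capacity <= 0 or target_capacity <= 0:
        raise ValueError("capacity ratings must be positive")
    return current_util_pct * current_capacity / target_capacity


# Invented ratings: a workload at 80% busy on a processor rated 1000,
# moved to a processor rated 1600.
estimate = projected_utilization(80.0, 1000.0, 1600.0)
print(f"Estimated utilization on the target processor: {estimate:.1f}%")
```

A single ratio like this ignores workload mix and LPAR configuration, which is exactly the gap that measured data is meant to narrow.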

With the introduction of the IBM System z10 server (February 2008), a new function, the CPU Measurement Facility (CPU MF), was incorporated.  Let’s not forget, z10 is now an n-2 technology, having been superseded by the z196/z114 and the latest zBC12/zEC12 generation of servers.  So each and every committed Mainframe customer should be positioned to benefit from the CPU MF function.

CPU MF provides optional hardware-assisted collection of information about logical CPU activity executed over a specified interval in selected Logical Partitions (LPARs).  The CPU MF counters function is intended to be run on a constant basis to collect long-term performance data (I.E. SMF Record 113), in a similar manner to how you collect other performance data.  Therefore this data source can be deployed to further refine the accuracy of Mainframe CPU capacity planning projections.  Let’s not forget:

“The primary on-going requirement for Mainframe Capacity Planning is to minimize any over or under capacity provision from forecast predictions, used for Mainframe server acquisition purposes”

Mainframe chip technology has also changed in complexity, especially with the latest iterations of CPU chips from the z10 server onwards (the z10 processor shares design elements with POWER6), incorporating many layers of cache memory.  Workload capacity performance will be quite sensitive to how deep into the memory hierarchy the processor must go to retrieve the workload’s instructions and data for execution.  Best performance occurs when the instructions and data are found in the cache(s) nearest the processor, so that little time is spent waiting prior to execution; as instructions and data must be retrieved from farther out in the hierarchy, the processor spends more time waiting for their arrival.

As workloads are moved between processors with different memory hierarchy designs, performance will vary, as the average time to retrieve instructions and data from within the memory hierarchy will vary.  Additionally, once on a processor this component will continue to vary significantly, as the location of a workload’s instructions and data within the memory hierarchy is affected by many factors, including locality of reference, IO rate, competition from other resources (E.g. Applications, LPARs, et al), and so on…

The most performance sensitive area of the memory hierarchy is the activity to the memory nest, namely, the distribution of activity to the shared caches and memory.  IBM introduced new terminology, namely Relative Nest Intensity (RNI), indicating the level of activity to this part of the memory hierarchy.  Using data from CPU MF, the RNI of the workload running in an LPAR may be calculated.  The higher the RNI, the deeper into the memory hierarchy the processor must go to retrieve the instructions and data for that workload.
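
In general terms, RNI is a weighted sum of the percentage of cache misses resolved at each level of the nest, with heavier weights for the levels farthest from the processor.  The sketch below shows only that general shape.  IBM publishes processor-specific formulas, and the weights, level names and percentages here are placeholders invented for illustration, not IBM's coefficients; the function name relative_nest_intensity is also hypothetical.

```python
def relative_nest_intensity(sourcing_pct, weights, scale=1.0):
    """Weighted sum of nest sourcing percentages, expressed as a ratio.

    sourcing_pct: percentage of misses sourced from each nest level
    weights:      per-level weight (placeholder values, NOT IBM's coefficients)
    scale:        optional processor-specific scaling factor
    """
    missing = set(weights) - set(sourcing_pct)
    if missing:
        raise KeyError(f"no sourcing percentage for: {sorted(missing)}")
    weighted = sum(weights[level] * sourcing_pct[level] for level in weights)
    return scale * weighted / 100.0


# Placeholder weights and invented percentages derived from CPU MF style counters.
placeholder_weights = {"L3": 0.5, "L4_local": 1.0, "L4_remote": 2.0, "memory": 8.0}
sourcing = {"L3": 60.0, "L4_local": 25.0, "L4_remote": 5.0, "memory": 10.0}
rni = relative_nest_intensity(sourcing, placeholder_weights)
print(f"Illustrative RNI: {rni:.2f}")
```

Because memory carries the largest weight, even a small shift of activity from the shared caches out to memory raises the result noticeably, which matches the intuition that a higher RNI means deeper trips into the hierarchy.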

Therefore the Mainframe Capacity Planner does have various data sources available to forecast how an existing or new workload might perform on an upgraded processor (CPC), further refining their CPU capacity requirement forecast.  As always, the final stage in a Mainframe Capacity Planning process is to input the forecast data into the IBM Processor Capacity Reference (zPCR) tool, to determine the exact model and associated resource configuration options for their unique business workload mix.

To summarize, does your Mainframe Capacity Planning process incorporate all of these CPU sizing data sources, in an easy-to-use and cost efficient manner?

Founded by former IBM staffers and capacity planning and performance management industry veterans William Shelden, PhD, and William Hart, PerfTechPro is designed to deliver sophisticated, affordable, easy-to-use solutions for IT management professionals looking for fast, insightful help without high-cost, complex and time-consuming purchasing and licensing requirements.

PerfTechPro for z/OS is a Capacity Planning and Performance Measurement tool specifically designed for the cost conscious and savvy 21st Century data centre.  PerfTechPro for z/OS is the next evolution in Mainframe Capacity Planning tools, having been architected from the ground up using the latest techniques.  PerfTechPro for z/OS provides sophisticated capacity and performance management capabilities, affordable by any sized data centre:

  • Clean, intuitive, easy-to-use interface and graphical representations, for example:
    • Consolidated instance lists guide users to make informed selections
    • Descriptive dialog boxes detail your configuration
    • Anticipates and pre-loads data to speed retrieval, reporting and analysis
    • Automated data management
  • Forecasting and modelling
  • Non-proprietary database, enabling data use outside of PerfTechPro
  • Capable of automated collection, analysis and reporting of SMF 113 records produced by the IBM CPU Measurement Facility (CPU MF)
  • Supports measurement and management of zAAP & zIIP Specialty Engines
  • Automated analysis and management of all key capacity and performance metrics, for example:
    • GPP Utilization of All LPARs
    • MIPS Usage by CPU
    • DASD Response Times
    • Address Spaces Dispatched and Waiting 

PerfTechPro for z/OS also simplifies the data management process associated with Mainframe Capacity Planning.  Using a streamlined process on the z/OS host, PerfTechPro extracts and formats the data required from various SMF sources (E.g. SMF Type 7n, Type 113), delivering an optimized Performance Data Base (PDB) for use by the Windows based GUI.  This optimized file safeguards fast processing during the reporting and forecasting activities, while simplifying any data aggregation processes (E.g. Weekly, Monthly, et al).  Moreover, PerfTechPro allows this data to be stored in non-proprietary database structures (E.g. Microsoft Access, SQL Server, MySQL, Oracle), and in more than one of them, as and if required.

PerfTechPro for z/OS is a simple-to-use and cost-efficient solution, allowing customers to quickly save time and money from their Capacity Planning and Performance Measurement solution.  Ultimately the bottom line objective for PerfTechPro for z/OS is to provide a best-of-breed solution for a very competitive cost. PerfTechPro for z/OS delivers business value by:

  • Ensuring enterprise zSeries Mainframe server resources are being used efficiently
  • Maximizing opportunities for cost-savings
  • Anticipating & responding to increased demand on resources
  • Reducing costs by exploiting periods of lower resource demand
  • Discerning underlying causes of performance and capacity issues
  • Eliminating time-consuming manual tracking, recording and analysis
  • Implementing disciplined management of valuable business resources

In conclusion, the Mainframe Capacity Planning process continues to evolve, forever striving to reduce any discrepancies in CPU requirements forecasting, which of course carry a high associated cost.  Integrating CPU MF (SMF Type 113) must be a mandatory requirement, safeguarding that CPU Sizing, Forecasting, Modelling and Correlation Analysis activities are optimized.  Additionally, the actual process of Mainframe Capacity Planning is an activity that requires great skill and carries considerable responsibility.  A modern day solution such as PerfTechPro for z/OS is worthy of consideration, having been designed by a team with a heritage in delivering Mainframe Capacity Planning solutions, with function that takes the latest zSeries CPU chip design into account.