Revisiting The zSeries Mainframe Storage Hierarchy

Recommendation: The next time you perform a zSeries Mainframe server upgrade, consider adding Flash Express cards, for an extra 1.4-5.6 TB of memory speed storage. Similarly, the next time you perform a zSeries Mainframe DASD subsystem upgrade, consider adding as much SSD (flash memory) capability that you can afford and justify. Both upgrades will deliver significant performance and business benefits, arguably for minimal cost, when considered as a several year TCO investment.

Conceptually the zSeries Mainframe storage hierarchy has comprised the same layers for many decades, while performance and capacity attributes have dramatically increased over time. Although System/390 introduced the concept of Expanded Storage (I.E. Hiperspace, Data Space) in 1990 and there have been various implementations of SSD (E.g. StorageTek 4080), the ability to transparently implement significant capacity memory layers has only recently become possible.

Let’s not forget, the closer data is to that most precious and expensive of resources, namely CPU, the faster it will process. When revisiting the traditional storage hierarchy, we can now consider two new layers, namely Flash Express and Solid State Drive (SSD):

zSeries Storage Hierarchy

I have previously written about the Flash Express layer. Flash Express is a new memory layer within the zSeries Mainframe storage hierarchy, which can be considered as either a Solid State Drive (SSD) or Storage Class Memory (SCM) technology. Flash Express is integrated on PCI Express attached RAID 10 Cards, packaged as a two card pair, each with a 1.4 TB capacity per mirrored card pair. A maximum of 4 card pairs can be configured, delivering up to 5.6 TB of memory capacity, assigned to LPAR resources, just like main memory.

The simplest function to benefit from Flash Express memory would be SVC dump processing, substantially reducing dump capture time.

Flash Express can also be deployed to replace z/OS disk paging, substantially reducing the response time associated (I.E. ~5-20 μs vs. ~10 ms). The benefit for z/OS paging is not the replacement of memory paging, but replacing disk paging with Flash Express storage. Flash Express is suitable for workloads that can tolerate paging, but will not benefit workloads that cannot tolerate paging activity. The fundamental z/OS design for Flash Express memory will not completely remove any virtual storage constraints created by a paging spike, although a modicum of scalability relief is expected due to the faster I/O associated with Flash Express memory.

In conjunction with Flash Express, there were advancements in the Real Storage Management (RSM) function, including pageable 1MB Large Page Support. Large Pages (1MB) deliver benefit, with increased performance, decreasing the number of Translation Lookaside Buffer (TLB) misses that an application incurs, reducing time when converting virtual addresses into physical addresses and reduced real storage usage to maintain DAT structures. The use of Large Pages typically deliver Internal Throughput Rate (ITR) performance benefits of ~1% for IMS, ~3% for DB2 and ~5% for Java workloads.

Although SSD (flash) storage might have been selectively deployed in the zSeries Mainframe Data Centre for the last 5 years or so, the ever increasing requirement for increased Quality of Service (QoS) in terms of data availability and ultra-fast transaction response times dictate the increased usage of SSD architectures. Entire DASD subsystems can be built upon SSD technologies, or more likely, hybrid subsystems, containing both SSD and traditional HDD technologies. This storage subsystem evolution allows organizations to gain significant competitive advantages, delivering new services for existing and more importantly, new customers alike.

Using SSD disk subsystems, overcomes the limitations of traditional spinning hard disk drives. However, not every enterprise application needs this ultra-high performance; since flash storage still costs more than spinning drives for the same capacity, organizations must be mindful of expenditure and now much flash memory (SSD) they deploy; as always, flexibility is key.

Complete or hybrid SSD I/O subsystems deliver performance and economic advantages for your mission critical business environment:

  • Green Data Centre: ~25-60% energy reduction (flash memory vs. spinning disk)
  • Data Centre Space: ~20-40% smaller footprint (memory cards vs. Hard Disk Drives)
  • Optimal Performance: Consistent ~1-3 ms access (Hard Disk Drives @ ~10 ms)

The utopia is for a self-tuning disk subsystem, automatically redirecting I/O between SSD and HDD, based on file performance and overridden, as and when required, by storage policies. Whether EMC, HDS (HP OEM) or IBM, this self-tuning ability is evolving, while each disk vendor has their own implementation. However, whatever your choice of disk subsystem, the ability to incorporate SSD into your storage hierarchy, either full or partial is evident.

In conclusion, ~25 years ago, the zSeries Mainframe user benefitted from faster performance via System/390 Expanded Storage and disk subsystems with cache and DASD Fast Write memory buffers. The cost of such memory storage was a major consideration then, but with good I/O tuning disciplines, the savvy zSeries Mainframe user benefitted from these technology advancements. Flash Express and SSD deliver the potential to deliver increased performance, for a relatively low cost, and now is the time to embrace these technologies. Ignore the storage hierarchy at your peril and as I previously documented, optimal I/O performance always delivers significant benefit.

IFL – A Cost Efficient zSeries Platform?

In September 2000, IBM introduced the Integrated Facility for Linux (IFL) processor, a specialty engine for and some might say dedicated to running the Linux Operating System.  At the time of this announcement, companion software named S/390 Virtual Image Facility for Linux was introduced to assist in the rapid deployment of IFL configurations, especially for non-Mainframe personnel.  However, this product was quickly discontinued, in favour of the standard z/VM Operating System, which is not difficult to learn and can accommodate hundreds if not thousands of zLinux images.

Today, the IFL is still a processor dedicated to Linux workloads on IBM System z servers.  The IFL is supported by z/VM virtualization and the Linux operating system.  The IFL cannot run other IBM operating systems.  The competitively priced IFL processor is a CPU capacity enabler, exclusively for Linux workloads.  Linux deployment (I.E. SUSE & Red Hat) on IFL’s can reduce expenses in the areas of operational efforts, energy, floor space and especially software.

The IFL provides the following functions and benefits:

  • The IBM Enterprise Linux Server is a dedicated System z Linux server, comprised of only IFL processors
  • No additional IBM software charges for traditional (E.g. z/OS, CICS, DB2, WebSphere, et al) environment
  • Performance improvement for Linux workloads with each successive generation of IFL and System z technology
  • Linux workload on the IFL does not result in increased IBM software charges for traditional System z operating systems and middleware
  • Same functionality as a General Purpose processor on a System z server
  • HiperSockets can be used for communication between Linux images, or Linux and other operating system images on the same System z system
  • z/VM virtualization and most IBM Linux middleware products, plus most vendor software products are priced per processor (core) according to the System z IBM International Program License Agreement (IPLA).  IPLA products have a one-time-charge (OTC) and an annual (optional) maintenance charge, called Subscription & Support
  • Supported by the current z/VM virtualization and IBM Wave for z/VM software versions
  • Always a full capacity processor, independent of the capacity of the other processors in the server
  • Orderable as a System z hardware feature. The number of orderable IFL features varies by the server model and configuration
  • Designed to operate asynchronously with other General Purpose processors
  • Managed by PR/SM in logical partition with dedicated or shared processors. The implementation of an IFL requires a Logical Partition (LPAR) definition, where following normal LPAR activation procedure, LPAR defined with an IFL cannot be shared with a general purpose processor.

There will always be the debate as to which processor and associated server type (E.g. x86, POWER, SPARC) is the most cost efficient, but there is no doubt that the ability to accommodate hundreds if not thousands of zLinux instances in one zServer environmental (E.g. Power, Cooling, Floor Space, et al) friendly footprint, with software pricing per core is worthy of consideration.

Adoption for zLinux has been steady and especially in the emerging territories where it’s not unusual for zSeries deployments to be totally zLinux (I.E. IBM Enterprise Linux Server) based.  Moreover, the majority of large and traditional IBM Mainframe users (I.E. z/OS) have installed at least one IFL, if only to evaluate the z/VM and zLinux offering.  Many have deployed the IFL and associated zLinux solution for business requirements.

Therefore, if one of the major cost benefit features of IFL is optimized software costs; can the IFL processor be considered for other workloads, originating from the traditional zSeries (I.E. z/OS) environments?

Proximal Systems Corporation (PSC) is a company with a solution that transparently offloads data processing from IBM Mainframes to Distributed Systems, with an objective of reducing software cost, while maintaining or improving performance.  The company name is derived from the concept of bringing disparate computing systems into close proximity, functionally speaking, providing totally seamless and transparent interoperability.  The result is a unified computing complex within which various tasks can be easily migrated between systems to their most cost efficient operating environment, while still being able to interoperate as if they were all hosted together on the same system.

The PSC Proxy Coupling Technology allows for a CPU orientated task to be offloaded from one system to another by means of an associated proxy task, which has an identical interface as the task to be offloaded, but delegates the majority of the processing to an offloaded task on another system.  The primary objective of this function are for the cost savings and/or performance improvements that might be delivered by migrating tasks to systems that are able to execute those tasks more efficiently.

The fact that the proxy task maintains the same interface as the application being replaced is crucial; as many past Mainframe migration projects have failed due to insurmountable interoperability problems between the Mainframe and Distributed Systems servers (I.E. Windows, Linux, UNIX, et al).  Proxy Coupling Technology offers a solution to this long-standing challenge.  In theory, this allows for the transparent offload of a traditional z/OS workload (E.g. Sort) from General Purpose (GP) processors, to a less expensive (E.g. IFL) alternative…

In the first instance, the Proxy Coupling Technology offloads General Purpose CPU workload associated with the z/OS sort (I.E. CA Sort, DFSORT, Syncsort) function, to another platform (E.g. IFL).  For IFL based implementations, HyperScokets are utilized to transfer data at memory speeds from the z/OS task to zLinux on the IFL, where the sort operation completes, while the resulting z/OS task and associated data are maintained, as per normal.  From an IFL viewpoint, Ahlsort software performs the sort operation, being a sort solution that maintains compatibility with the majority of z/OS sort function (I.E. Control Card Syntax).  Therefore, this is a transparent implementation, where the only consideration is how much CPU capacity is required for the offload function (E.g. IFL, x86).  The benefits are reduced z/OS MSU usage for the sort function, which can be quite significant, as most business data (E.g. Database Offloads, Customer Orientated, et al) is sorted on a daily if not more frequent basis.

Just as IBM introduced the zAAP on zIIP capability, which allowed some customers to more easily justify a specialty engine (I.E. zIIP), combining workloads to exploit the full capability of the specialty engine; in theory the same ethos applies with the Proxy Coupling Technology.  For the avoidance of doubt, workloads that can be processed on an IFL, such as z/OS sort tasks, can assist in delivering higher Return On Investment (ROI) levels for the IFL, for example:

  • Reduced z/OS WLC MSU usage (I.E. Sort function offload) and associated software costs savings
  • IFL processors run at Full Speed and do not add to traditional workload (I.E. z/OS) software costs
  • Utilize any spare IFL CPU resource not used, releasing General Purpose CPU resource for other work

In conclusion, the Proxy Coupling Technology offers a proposition that is similar to the IBM philosophy of reducing z/OS software costs via specialty engines.  Seemingly to date, primarily only the zIIP and zAAP specialty engines were available to optimize CPU usage for z/OS workloads.  Offloading CPU cycles and thus MSU workload to IFL makes sense, utilizing a cost efficient and indeed a full power CPU engine, where for cost reasons, maybe the majority of z/OS customers don’t deploy the “highest” derivative of General Purpose CPU engine available to them.  On the face of it, the realm of possibility exists for other workloads to benefit from z/OS to IFL CPU offload, following sort, which seems to make sense as the first workload to utilize this solution.

z/OS Soft Capping: Balancing Cost & Performance

Historically each and every LPAR was assigned a Relative Weight value; where a more meaningful description would be the initial processing weight. This relative weight value is used to determine which LPAR gains access to resources, where multiple LPARs are competing for the same resource. Being unit-less is one minor challenge of the relative weight value, meaning that it has no explicit CPU capacity or resource value. Typically installations would use a simple multiple of ten metric, most likely 1000, and allocate weights accordingly (E.g. 600=60%, 300=30%, 10=10%, et al). Therefore during periods of resource contention, PR/SM would allocate resources to the requisite LPAR, based upon its relative weight.

Using relative weight to classify all LPARs as equal, at least from a generic class viewpoint, does have some considerations; primarily differentiating between Production and Non-Production workloads. Restricting a workload to its relative weight share of resources is known as Hard Capping. This setting is typically used to restrict Non-Production (E.g. Test) environments to their allocated resource and is also useful for cost control (E.g. Outsourcers), knowing that the LPAR will never consume more than its allocated relative weight allowance.

Hard Capping behaviour changes dependent on the use of the HiperDispatch setting. When HiperDispatch is not chosen, capping is performed at the Logical CP level, where the goal is for each logical CP to receive its relative CP share, based on the relative weight setting. When HiperDispatch is active, vertical as opposed to horizontal CPU management applies. So, a High categorization dictates capping at 100% of the logical CP, whereas a Medium or Low setting allows for resource sharing based on a relative weight per CP basis.

The Intelligent Resource Director (IRD) function provides more advanced relative weight management, automating management of CPU resources and a subset of I/O resources. Workload Manager (WLM) manages physical CPU resource across z/OS images within an LPAR cluster based on service class goals. IRD is implemented as a collaboration between the WLM function and the PR/SM Logical Partitioning (LPAR) hypervisor:

  • Logical CP Management: dynamically allocating logical processors (E.g. Vary On-Line/Off-Line)
  • Relative Weight Management: dynamically redistributing CPU resource as per LPAR weights
  • CHPID Management: dynamically assigning logical channel paths between eligible LPARs

IRD optimizes resource usage, enabling WLM to deliver workload goals.

The use of relative weight in association with Hard Capping and/or IRD/WLM granularity has become somewhat limited for most Mainframe installations with the advent of Sub-Capacity pricing (I.E. MLC via SCRT/R4HA). Primarily because there is no direct correlation to manage CPU resource at a meaningful level, namely the MSU (vis-à-vis CPU MIPS) metric.

Defined Capacity (DC) provides Sub-Capacity CEC pricing by allowing definition of LPAR capacity with a granularity of 1 MSU. In conjunction with the WLM function, the Defined Capacity of an LPAR dictates whether Soft Capping is invoked or not. At this juncture, we should consider how and when WLM measures CPU resource usage and if and when Soft Capping is activated and deactivated:

WLM is responsible for taking MSU utilization samples for each LPAR in 10-second intervals. Every 5 minutes, WLM documents the highest observed MSU sample value from the 10-second interval samples. This process always keeps track of the past 48 updates taken for each LPAR. When the 49th reading is taken, the 1st reading is deleted, and so on. These 48 values continually represent a total of 5 minutes * 48 readings = 240 minutes or the past 4 hours (I.E. R4HA). WLM stores the average of these 48 values in the WLM control block RCT.RCTLACS. Each time RMF (or BMC CMF equivalent) creates a Type 70 record, the SMF70LAC field represents the average of all 48 MSU values for the respective LPAR a particular Type 70 record represents. Hence, we have the “Rolling 4 Hour Average”. RMF gets the value populated in SMF70LAC from RCT.RCTLACS at the time the record is created.

SCRT also uses the Type 70 field SMF70WLA to ensure that the values recorded in SMF70LAC do not exceed the maximum available MSU capacity assigned to an LPAR. If this ever happens (due to Soft Capping or otherwise) SCRT uses the value in SMF70WLA instead of SMF70LAC. Values in SMF70WLA represent the total capacity available to the LPAR.

We should also consider the two possibilities for MLC software payment (I.E. SCRT) based upon MSU resource usage. Quite simply, the MSU value passed for SCRT invoice consideration is the R4HA or the Defined Capacity, whichever is the lowest. Put another way; if the R4HA exceeds Defined Capacity, Soft Capping applies to the LPAR.

The primary disadvantage of Soft Capping is that the Defined Capacity setting is somewhat static; it is manually defined once, maybe several times a day for workloads with distinct characteristics (E.g. On-Line, Batch, et al), but dynamic DC management based upon inter-related LPAR behaviour is at best, evolving. The primary considerations for Soft Capping are:

  • An LPAR can only be managed via Soft Capping or Hard Capping; not both
  • DC rules only applies to General Purpose CP’s (Hard Capping for Specialty Engines is allowed)
  • An LPAR must be defined with shared CP’s (dedicated CP’s not allowed)
  • All LPAR Sub-Capacity eligible products have the same MSU capacity (I.E. DC)

Soft Capping is relatively simple to implement and typically generates MLC software costs savings, with minimal impact.

Group Capacity Limit (GCL) provides an extension to the Defined Capacity (DC) Soft Capping function. GCL allows an MSU limit for total usage of all group LPARs, with a granularity of 1 MSU. The primary considerations for GCL are:

  • Works with DC LPAR capacity settings
  • Target share does not exceed DC
  • Works with IRD
  • Multiple CEC groups allowed; but an LPAR may only be defined to one group
    An LPAR must be defined with shared CP’s, with WAIT COMPLETION = NO specification

It is possible to combine IRD weight management with the GCL function. Based on installation policy, IRD can modify the relative weight setting to redistribute capacity resource within an LPAR cluster.

However, IRD weight management is suspended when GCL is in effect, because LPAR resource entitlement within a capacity group can be (I.E. Pre zxC12) derived from the current weight. Hence the LPAR might get allocated an unacceptable low weight setting, generating a low GCL entitlement.

GCL also allows for MSU to be shared between LPARs in a group, where one LPAR would be a donator and another would be a receiver. Therefore the customer classifies their LPARs accordingly and when a high-priority LPAR requires additional MSU resource, it will be allocated from a lower priority LPAR, if available. This provides a modicum of flexibility, but by definition, peak workloads are not predictable and typically require a significantly higher amount of MSU for a short time period. Typically this requirement will not be satisfied with the GCL function.

Soft Capping techniques, either at the individual (DC) or group (GCL) level deliver cost saving benefit, but a fine granularity of management is required to balance cost saving versus associated performance considerations. The primary challenges associated with Soft Capping are its interactions with workload characteristics and an inability to dynamically manage MSU allocation, in-line with the R4HA. Put another way, the R4HA is derived from 48*5 Minute samples, whereas DC and GCL settings are typically defined on an infrequent (E.g. Monthly or longer) basis.

As z/OS evolves, further in-built function is available to manage MSU capacity. zSeries Capacity Provisioning Manager (CPM) is designed to simplify the management of temporary capacity, defined capacity and group capacity. The scope of z/OS Capacity Provisioning is to address capacity requirements for relatively short term workload fluctuations for which On/Off Capacity on Demand or Soft Capping changes are applicable. CPM is not a replacement for the customer derived Capacity Management process. Capacity Provisioning should not be used for providing additional capacity to systems that have Hard Capping (initial capping or absolute capping) defined.

With the introduction of z/OS 2.1, CPM functionality incorporates Soft Capping support via the DC and GCL functions. CPM functions from a set of installation defined policies and parameters, where the CPM server receives three types of input:

  • Domain Configuration: defines the CPCs and z/OS systems to be managed
  • Policy: contains the information as to which work is eligible, for which conditions and during which timeframes and capacity increases for constrained workloads
  • Parameter: contains environment descriptors (E.g. UNIX Environment, Installation Options, et al)

From a customer viewpoint, policy definition allows them to define the provision of CPU resource:

  • Date & Time: When capacity provisioning is allowed
  • Workload: Which service class qualifies for provisioning?
  • CPU Resource: How much additional MSU capacity can be allocated?

CPM provides more function when compared with Defined Capacity and Group Capacity Limit Soft Capping techniques. Therefore allowing for time schedules to be defined, workloads to be categorized and MSU resource to be allocated in a dynamic and granular manner.

A modicum of complexity exists when considering the arguably most important factor for CPM policy definition, namely the Performance Index (PI):

  • Activation: PI of service class periods must exceed the activation threshold for a specified duration, before the work is considered as eligible.
  • Deactivation: PI of service class periods must fall below the deactivation threshold for a specified duration, before the work is considered as ineligible.
  • Null: If no workload condition is specified a scheduled activation/deactivation is performed; with full capacity as specified in the rule scope, unconditionally at the start and end times of the time condition.

For workload based provisioning it is a necessary condition that the current system Performance Index exceeds the specified customer policy PI metric. One must draw one’s own conclusions regarding PI criteria settings, but to date, they’re largely based on arguably complex mathematical formulae, which perhaps is not practicable, especially from a simple management viewpoint.

With the requisite hardware (I.E. zxC12+) and Operating System levels (I.E. z/OS 1.13+), CPM provides extra functionality for the customer to implement granular Soft Capping techniques to balance cost and performance. When compared with Defined Capacity and Group Capacity Limit techniques, CPM delivers increased granularity for managing capacity dynamically, based on customer derived policies, recognizing time slots, workloads and MSU resource increases accordingly.

From a big picture viewpoint, without doubt, we must recognize the fundamental role that WLM plays in Soft Capping. Quite simply, the 48*5 Minute MSU resource samples dictate whether a workload will be eligible for Soft Capping or not and from a cumulative viewpoint, these MSU samples dictate the R4HA metric. Based on this observation, efficient and functional Soft Capping must be workload based (I.E. WLM Service Class), be dynamic and operational on a 24*7 basis, because workload peaks are never predictable, while balancing MSU resource accordingly. Of course, simplicity of implementation and management, supplemented by meaningful reporting is mandatory.

Once again, observing the 48*5 Minute MSU resource samples from a R4HA viewpoint, if a workload was to increase MSU usage by an average of 50% for 1 Hour (I.E. 12 Samples), and decrease MSU usage by an average of 20% for 2.5 Hours (I.E. 30 Samples), from an average viewpoint, the R4HA has remained static. Therefore an optimum Soft Capping technique needs to recognize WLM service class requirements, reacting in a timely manner, increasing and decreasing MSU usage, to safeguard workload performance for Time Critical workloads, while optimizing SCRT MLC cost.

zDynaCap delivers automated capacity balancing within CPCs, Capacity Groups or Groups of LPARs. Central to zDynaCap are the predefined balancing policies. Within these balancing policies, users define their MSU ranges of Groups and LPARs and also the priorities of the associated LPAR Workload. zDynaCap continually monitors overall usage and compares this to the available capacity and the user defined MSU balancing policies. For example, should a high priority workload on one LPAR not get enough capacity, while a low priority workload on another within the group gets too much capacity, available MSU capacity is distributed according to customer derived balancing policies. Only if there is no leftover capacity to be rescheduled within the defined Group, and if the high or medium priority workload will be slowed down, will zDynaCap add MSU.

With zDynaCap Capacity Balancing, available MSU capacity is balanced within LPAR groups, safeguarding that during peak time the mission critical workload is processed as per business expectations (E.g. SLA/KPI) for the lowest possible MLC cost.

In conclusion, given the significance of IBM MLC software (E.g. z/OS, CICS, DB2, IMS, WebSphere MQ, et al) costs, arguably every Mainframe environment should deploy a capping technique for cost optimization. Hard Capping might work for some, but in all likelihood, Soft Capping is the primary choice for most Mainframe environments. For sure, IBM have delivered several Soft Capping techniques, with varying levels of function and granularity, namely Defined Capacity, Group Capacity Limit (GCL) and the zSeries Capacity Provisioning Manager (CPM). It was forever thus and the ISV community exists because they specialize, architect and deliver specialized solutions and zDynaCap is such a solution, recognizing the fundamental rules of IBM Mainframe Soft Capping, namely the underlying WLM and R4HA foundation.

zIIP Into The Future: Mainframe Specialty Engines Evolution

Sometimes we might lose sight that change can be evolutionary as opposed to revolutionary and this certainly applies to IBM Mainframe specialty engines, for example:

  • 1997: Internal Coupling Facility (ICF)
  • 2000: Integrated Facility for Linux (IFL)
  • 2004: System z Application Assist Processor (zAAP)
  • 2006: System z Integrated Information Processor (zIIP)

To assist with lower IBM software pricing, arguably the ICF offering became the de facto standard for a Mainframe user to be considered “actively coupled”.  Therefore deploying two or more eligible IBM Mainframes, physically attached via coupling links to a common Coupling Facility (I.E. ICF).

The Integrated Facility for Linux (IFL) is a processor dedicated to Linux workloads on IBM System z servers.  The IFL is supported by the z/VM virtualization software and the Linux operating system.  Most customers have at least dabbled into this technology, while some are using this technology extensively, primarily for distributed server consolidation.

Somehow the zAAP specialty engine has become the “black sheep” of the family where the current zEC12 and zBC12 are planned to be the last System z servers to offer support for zAAP specialty engine processors.

As of z/OS V1.11, functionality was delivered enabling zAAP eligible workloads to run on zIIP engines.  This function allowed both zIIP & zAAP-eligible workloads to process on zIIP.  This capability was ideal for customers with insufficient zAAP or zIIP eligible workload to justify a specialty engine.  Whereas the combined eligible workloads increase the ROI metrics for zIIP deployment.  The zAAP specialty engine is primarily targeted for web-based applications and SOA-based technologies, namely Java and XML.

So for z/OS type workloads, we must “zIIP Into The Future”…

Sometimes we need to look at the big picture, where the IBM organization is comprised of many business units, including the Mainframe business unit.  The Mainframe business unit itself contains many groups, including, but not limited to, the Hardware and Software groups.

As we all know, z/OS software TCO is significant and so this translates into higher revenues for the IBM Mainframe software group; but what about the IBM Mainframe hardware group?  Perhaps the specialty engines, primarily in the form of zIIP will generate revenue stream for this business unit.  Along with the introduction of zBC12 & zEC12 servers, IBM increased the zIIP to General Purpose (CP) engines ratio to 2:1; meaning you can have 2 zIIP specialty engines with the same capacity as an associated CP engine.  Previously the maximum ratio allowed was 1:1 (Specialty:CP).

What workloads are zIIP eligible?  Over time and since 2006 the amount of workload that is zIIP eligible has increased, primarily due to software development and upgrade efforts of IBM and the 3rd party ISV community:

  • DB2 for z/OS exploits the zIIP capability for portions of eligible data serving, pureXML and utility workloads
  • Other 3rd party DBMS solutions, including ADABAS & IDMS offload workload to zIIP
  • Most Systems Management tools (E.g. OMEGAMON, MAINVIEW, RMF, SYSVIEW, et al)
  • z/OS XML System Services for eligible XML validating and non-validating workloads
  • Other z/OS functions including /OS Communications Server, Global Mirror, CIM Server, et al

What are the benefits of deploying a zIIP specialty engine?

  • Lower acquisition and maintenance costs, when compared with general CP
  • zIIP engines run at full rated CP speed
  • Offload work (CPU) from General Purpose (CP) engines
  • No cost for Sub-Capacity eligible IBM software (I.E. WLC)

So, one must draw one’s own conclusions, but seemingly the deployment of zIIP engines is a “no brainer”!

Hmmm, once again, evolution is a good thing and the zIIP engine has an 8 year history and its predecessor zAAP, a 10 year history.  This ~10 year period has allowed for user experiences and IBM function developments to evolve a more stable and rounded offering and as previously stated, a product for the IBM Mainframe Hardware group to focus upon.

From a customer viewpoint, zIIP deployment requires a Capacity Planning evolution, which should be reasonably straightforward.  The big difference is the CP to zIIP offload consideration and some of the lessons learned include:

  • Software costs – Multiple-Processors; CP to zIIP Offload Rate; zIIP utilization
  • Hardware costs – Installed Books (total MSU/MIPS capacity); Additional LPAR(s)
  • Peak CPU utilization – Safeguard that zIIP exploitation reduces peak CPU usage
  • CPU per Transaction – Slight increase in CPU (not necessarily elapsed time) as workload switches from CP to zIIP
  • zIIP utilization – Early experiences indicate ~50% zIIP engine busy is a good number

In conclusion, zIIP deployment has been gradual and evolutionary, but many factors indicate that zIIP is here to stay and it is the future.  Seemingly from an IBM viewpoint, with benefit for the Mainframe Hardware Group in terms of the eradication of the zAAP engine, the increase in CP:zIIP ratio to 2:1 and the associated customer benefits of Sub-Capacity software pricing.  From a customer viewpoint, ignoring these pointers might not be wise, as z/OS software costs are significant and CPU resource requirements keep increasing.  Adding extra zIIP CPU capacity reduces hardware and associated software costs and so this is the “no brainer” observation that can’t be ignored for much longer…

Mainframe Server Planning: Vendor Interaction

In the last few weeks I have encountered a couple of scenarios regarding Mainframe Server upgrades that have surprised me somewhat.  The first was at the annual UK GSE conference during November 2013, where one of the largest UK Mainframe customers stated “we had problems regarding the capacity sizing of the IBM Mainframe server installed and our vendor was not very helpful in resolving this challenge with us”.  The second was a European customer with 2 aging servers deployed, z9 BC, and they had asked their IBM Mainframe server vendor to provide an upgrade quotation.  The server vendor duly replied, providing a like-for-like upgrade quotation, 2 new zBC12 servers, which at first glance seemed to be a valid configuration.

The one thing in common for these 2 vastly different Mainframe customers, the first very large, the second quite small, is that inadvertently they didn’t necessarily engage their respective vendors with the best set of questions or indeed terms of reference; while the vendors might say “ask me no questions and I’ll tell you no lies”…

For the 2nd scenario, I was asked to quickly review the configuration provided.  My first observation was to consolidate both workloads on 1 server.  The customer confirmed, there was no business reason to have 2 servers, it was historic, and there wasn’t even a SYSPLEX between the 2 z9 BC servers.  The historic reason for the 2 z9 BC servers was the number of General Purpose (GP) engines supported.  My second observation was that software licensing could be simplified and optimized with aggregated MSU and use of the AEWLC pricing model.  So within ~1 hour, the customer had a significant potential to dramatically reduce costs.

We then suggested an analysis of their configuration with 2 software products, PerfTechPro for z/OS and zDynaCap.  They already had the SMF data, so using the simulation abilities of these products, the customer quickly confirmed they could consolidate their workloads onto 1 zBC12, deploy zIIP processors to offload ~15% CPU usage from GP, and control MSU allocation with zDynaCap, saving another ~10% of CPU.  For this customer, a small investment in software products reduced their server upgrade costs by ~€400,000 in year 1, with similar software savings, each and every year forever more.  Although they didn’t have the skills in-house from a Mainframe Capacity Planning and software licensing viewpoint, this customer did eventually ask the right questions, and the rest as they say is history!

No man or indeed Mainframe customer is an island, so don’t be afraid to ask questions of your vendors or business partners!

From a cost viewpoint, both long-term (TCO) and day 1 (TCA), the requirement to deploy the optimum Mainframe server configuration from a capacity viewpoint cannot be under estimated, both in terms of hardware costs, but more importantly, associated software costs.  It therefore follows that Mainframe Capacity Planning and Mainframe Software Licensing knowledge is imperative, but I’m not so sure there are that many Mainframe customers that have clearly defined job roles for such disciplines.

To generalize, always a dangerous thing, typically the larger Mainframe customer does have skilled and seasoned personnel for the Capacity Planning discipline, while the smaller Mainframe user might rely on a generic Systems Programmer or maybe even rely on their vendor to size their Mainframe servers.  From a Mainframe software licensing viewpoint, there seems to be no general rule-of-thumb, as sometimes the smaller customer has significant knowledge and experience, whereas the larger Mainframe customer might not.  Bottom Line: If the Mainframe customer doesn’t allocate the optimum capacity and associated software licensing metrics for their installation, problems will arise, probably for several years or more!

Are there any simple solutions or processes that can assist Mainframe customers?

The first and most simple observation is to engage your vendor and safeguard that they generate the final Mainframe server configuration that is used for Purchase Order activities.  For sure, the customer will have their capacity plan and perhaps a “draft” server configuration, but even in these instances, the vendor should QA this data, refining the bill of materials (E.g. Hardware) accordingly.  Therefore an iterative process occurs between customer and vendor, but the vendor is the one that confirms the agreed configuration is fit for purpose.  In the unlikely event there are challenges in the future, the customer can work with their vendor to find a solution, as opposed to the example stated above where the vendor left their customer somewhat isolated.

The second observation is leverage from the tools and processes that are available, both generally available and internal for vendor pre sales personnel.  Seemingly everybody likes something for nothing and so the ability to deploy “free” tools will appeal to most.

For Mainframe Capacity Planning, in addition to the standard in-house processes, whether bespoke (E.g. SAS, MXG, MICS based) or a packaged product, there are other additional tools available, primarily from IBM:

zPCR (Processor Capacity Reference) is a generally available Windows PC based tool, designed to provide capacity planning insight for IBM System z processors running various z/OS, z/VM, z/VSE, Linux, zAware, and CFCC workload environments on partitioned hardware.  Capacity results are based on IBM’s most recently published LSPR data for z/OS.  Capacity is presented relative to a user-selected Reference-CPU, which may be assigned any capacity scaling-factor and metric.

zCP3000 (Performance Analysis and Capacity Planning) is an IBM internal tool, Windows PC based, designed to for performance analysis and capacity planning simulations for IBM System z processors, running various SCP and workload environments.  It can also be used to graphically analyse logically partitioned processors and DASD configurations.  Input normally comes from the customer’s system logs via a separate tool (I.E. z/OS SMF via CP2KEXTR, VM Monitor via CP3KVMXT, VSE CPUMON via VSE2EDF).

zPSG (Processor Selection Guide) is an IBM internal tool, Windows PC based, designed to provide sizing approximations for IBM System z processors intended to host a new application, implemented using popular, commercially available software products (E.g. WebSphere, DB2, ODM, Linux Apache Server).

zSoftCap (Software Migration Capacity Planning Aid) is a generally available Windows PC based tool, designed to assess the effect on IBM System z processor capacity, when planning to upgrade to a more current operating system version and/or major subsystems versions (E.g. Batch, CICS, DB2, IMS, Web and System).  zSoftCap assumes that the hardware configuration remains constant while the software version or release changes.  The capacity implication of an upgrade for the software components can be assessed independently or in any combination.

zBNA (System z Batch Network Analysis) is a generally available Windows PC based tool, designed to understand the batch window, for example:

  • Perform “what if” analysis and estimate the CPU upgrade effect on batch window
  • Identify job time sequences based on a graphical view
  • Filter jobs by attributes like CPU time / intensity, job class, service class, et al
  • Review the resource consumption of all the batch jobs
  • Drill down to the individual steps to see the resource usage
  • Identify candidate jobs for running on different processors
  • Identify jobs with speed of engine concerns (top tasks %)

BWATOOL (Batch Workload Analysis Tool) is an IBM internal tool, Windows PC based, designed to analyse SMF type 30 and 70 data, producing a report showing how long batch jobs run on the currently installed processor.  Both CPU time and elapsed time are reported. Similar results can then be projected for any IBM System z processor model. Basic questions that can be answered by BWATOOL include:

  • What jobs are good candidates for running on any given processor?
  • How much would jobs benefit from running on a faster processor?
  • For jobs within a critical path (batch window), what overall change in elapsed time might occur with a new processor?

zMCAT (Migration Capacity Analysis Tool) is an IBM internal tool, Windows PC based, designed to compare the performance of production workloads before and after migration of the system image to a new processor, even if the number of engines on the processor has changed.  Workloads for which performance is to be analysed must be carefully chosen because the power comparison may vary considerably due to differing use of system services, I/O rate, instruction mix, storage reference patterns, et al.  This is why customer experiences are unique from an internal throughput ratio (ITRR) based on LSPR benchmark data.

zTPM (Tivoli Performance Modeler) is an IBM internal tool, Windows PC based designed to let you build a model of a z/OS based IBM System z processor, and then run various “what if scenarios”.  zTPM uses simulation techniques to let you model the impact of changes on individual workload performance.  zTPM uses RMF or CMF reports as input.  Based on these reports, zTPM can create summary charts showing LPAR as well as workload utilization.  An automated Build function lets you build a model that represents the system for any reporting interval.  Once the model is built, you can make changes to see the impact on workload performance.  zTPM is also available as an IBM software product offering.

Therefore there are numerous tools available from IBM to assist their customers determine optimum Mainframe server capacity requirements.  Some of these tools are generally available without engaging the IBM account team, but others are internal to IBM, and for that reason alone, Mainframe customers must engage their IBM Mainframe account team to participate in their capacity planning activities.  Additionally, as the only supplier of Mainframe Servers, IBM have a wealth of knowledge and indeed a responsibility and generally a willingness to assist their customers deploy the right Mainframe server configuration from day 1.

As a customer, don’t be afraid to engage external 3rd parties to perform a sanity check of your thinking and activities, clearly IBM as they will be fulfilling your IBM Mainframe server order.  However, consider engaging other capacity/performance and software licensing specialists as their experience incorporates many customers, as opposed to an insular view.  Moreover, such 3rd parties probably utilize their own software tools or products to assist in this most important of disciplines.

In conclusion, as always, the worst question is the one not asked, and for this most fundamental of processes, not collaborating with your vendor and the wider community, might leave you as an individual exposed and isolated, and your company exposed to the consequences of an undersized or oversized Mainframe sever configuration…

Mainframe ISV Software: Is Continuous Product Improvement Always Evident?

Ken Venturi once said “I don’t believe you have to be better than everybody else.  I believe you have to be better than you ever thought you could be”.

Wouldn’t it be great if every CTO and/or Product manager had this same philosophy for their Mainframe software solution?  One such example I have experienced over the years is (E)JES from Phoenix Software International (PSI).  Of course it’s really important to have Day 1 support for the latest release of Operating System, z/OS 2.1 being the latest example, but what about actually exploiting the latest functionality available with the latest zSeries Mainframe Enterprise Servers and z/OS Operating Systems?

To drive maximum bang from you’re your buck, optimal performance and robust cost optimization can only be possible by recognizing and exploiting the latest Mainframe function ASAP, as and when appropriate.  Furthermore, listening to your customers, analysing their feedback, actively participating in User Organizations such as SHARE, and so on, will all help in continuous product development and innovation.

Here are some of the reasons why (E)JES has succeeded over a 30+ year period, recognizing and exploiting new z/OS function, as and when the updated z/OS is released for General Availability (GA).  Even today, with Version 5.3 supporting z/OS 2.1 as of day 1, (E)JES continues to offer value-added function for the seasoned, inexperienced and in fact, all IBM Mainframe technicians:

  • 64-bit performance optimizations (I.E. MEMLIMIT: above-the-bar) for both (E)JES client and server components, safeguarding minimal z/OS resource usage.
  • Nearly all (E)JES JES subsystem processing routines are eligible for zIIP redirection, delivering software cost savings for all (E)JES users.  Sub-Capacity System z processor users experience improved (E)JES performance because zIIP engines always run at full speed.  This behaviour differs from that of General Purpose CPs, “throttled” with Sub-Capacity deployments.
  • (E)JES code executes faster via its inbuilt High Performance Routine (HPR) facility, specifically developed to make (E)JES code execute faster while accessing data in JES control blocks.  HPRs have a shorter instruction path length than previous coding techniques, avoiding delays in modern z Series CPU instruction pipelines.
  • If High Performance FICON (zHPF) is available, (E)JES uses Transport Mode channel programs for JES Spool I/O.  When zHPF is not available, or when a CAS server performs I/O against the global data set, (E)JES uses the highest-performing Command Mode channel programs currently available.  These channel programs perform I/O significantly faster than “ordinary” channel programs.
  • The use of 24-bit (captured) UCBs puts a strain on the 24-bit virtual storage resource.  The use of ordinary (non-extended) TIOT entries puts a limit on the total number of allocations that can exist simultaneously in an address space.  (E)JES supports and uses 31-bit (uncaptured) UCBs and the extended TIOT (XTIOT) function (I.E. NON_VSAM_XTIOT=YES in DEVSUPxx PARMLIB)
  • (E)JES supports placement of JES spool data sets in the cylinder-managed area of an Extended Address Volume (EAV).  Of course, as of z/OS 1.12, EAV increases 3390 DASD capacity to ~1 TB.
  • (E)JES Pattern Utility Matching uses the SRST hardware instruction.  Empirical measurements show this technique is far faster on modern System z processors than alternatives such as the TRT instruction or “brute force” matching techniques using CLI/CLC.

One of the primary benefits of upgrading IBM z/OS software is the overall system performance benefit and associated cost reduction, but of course, IBM can only deliver the function and ability, while it’s incumbent upon the ISV community to upgrade their software products accordingly.  A key goal for any good ISV software product is to try to provide a value-add in the area of performance.  This has been one of the primary areas of focus for (E)JES since its introduction in 1978. 

Most spool display and management products tend to rely on the most resource-intensive interface available, namely the JES subsystem provided SSI 80.  (E)JES benchmarking tests against the most readily-available JES SSI 80 exploiters demonstrates significant CPU savings when deploying (E)JES.

Software products also need to deliver continuous improvements with regard to usability, presentation and in-built function, increasing user and system administrator productivity.  Without doubt, optimization encompasses not just hardware, but software, services, systems management disciplines and “best practices” that tie it all together.  Here are some of the usability enhancements that (E)JES has incorporated:

  • ISPF users running a 3270 emulator on a programmable workstation can now search IBM Eclipse-based InfoCenters via (E)JES.  Although (E)JES fully supports BookManager format documentation, BookManager READ/MVS is now obsolete, beginning with z/OS 2.1, BookManager softcopy books are no longer delivered by IBM.  IBM has stated that InfoCenters, and eventually KnowledgeCenters, are their strategic direction for online documentation.
  • (E)JES Web is a new, browser-based interface to (E)JES.  The associated RESTful API delivering this web enabled technology provides a framework for the creation of Eclipse plug-ins, mobile applications, and other web services clients.  This facility will provide a “rapid learning” type facility for users (E)JES users, both new and old that might be uncomfortable navigating traditional 3270 interfaces.
  • (E)JES provides a Java Application Programming Interface (API), complementing other in-built APIs for REXX and procedural languages.  By using an (E)JES API, a user can harness the versatility of their preferred programming language to interface and interact with (E)JES.  This support provides an interface to deliver nearly all of the capabilities available to an interactive (E)JES user.
  • (E)JES incorporates context sensitive help function, with point-and-shoot/pop-up dialogs, helping educate users on (E)JES, JES and z/OS while they work.  Users can get pop-up explanations of columns, input choices for unprotected fields, and a list of line commands.  Smart pop-ups explain the contents of certain columns, such as system abend codes.

The latest (E)JES Release Information Manual eloquently details the product enhancements over the last 5 releases or so, providing a good Product Roadmap reference point.

So, whether the ISV software product you deploy has been available for several years or several decades, do you safeguard maximum business benefit for optimal cost by considering:

  • Does the ISV deploy the latest zSeries server (I.E. zBC12, zEC12) for software interoperability and full hardware function exploitation; or an emulation (I.E. zPDT) technique?
  • Does the ISV deliver value-added z/OS related function on Day 1 or even within a year of the latest z/OS release?
  • Does the ISV deliver meaningful function to assist your users deploy said function, while simplifying environment management for system administrators?
  • Does your ISV product optimize cost, with Sub-Capacity pricing in MSU increments, aggregated MSU costs for your entire zSeries Mainframe environment, as opposed to specific workloads (E.g. CPC’s, LPAR’s, et al)?
  • Does your ISV product optimize cost by offloading the majority of its CPU function to zIIP specialty engines, which run at maximum speed, and where software “runs for free”?

Of course, only you can ask and potentially answer these questions during your day-to-day activities of maintaining currency and optimal performance for your Mainframe software portfolio.

Sometimes the hardest questions anybody can ask are the questions they ask themselves, which are never rhetorical questions!  Extracted verbatim from the latest (E)JES Release Information Manual:

Team (E)JES took advantage of the Phoenix Software International zHISR performance analysis product to discover performance “hot spots” in  the (E)JES product.  Sometimes the simplest, least conspicuous piece of code turns out to be a major CPU contributor.  See below for some of the most embarrassing “surprise” hot spots we discovered using zHISR in a z/OS 2.1 LPAR:

  • Over 30% of the CPU used during a Spool Data Browse FIND operation, against a multi-million-line SYSOUT in JES2, turned out to be code that was clearing a record buffer to blanks using MVCL.  This clearing code was eliminated and some minor adjustments were made in other code to compensate for this change.
  • 27% of the CPU used to produce the Activity display in JES2 turned out to be in a routine that manages an internal resource called the “Job Positions Table.”  The algorithm was improved (to work more like its JES3 counterpart) and that routine is no longer a significant CPU contributor.
  • 9% of (E)JES session start-up was a 26-year-old “brute force” prime number generator used to compute the size of a hash table.  That code was totally reworked and now accounts for approximately .02% of session start-up CPU.
  • A 6% performance penalty was observed when sorting a tabular display with a moderate number of rows. The hot spot turned out to be the code that cleared the work area for the sort service to zeros (another MVCL). This overhead was reduced to .04%.

Mea culpa and humility, never a bad thing, but you have to be honest with yourself and ask yourself the right questions!  So going back full circle and quoting Ken Venturi once again, “I don’t believe you have to be better than everybody else.  I believe you have to be better than you ever thought you could be”.  You must draw your own conclusions as to whether such an observation applies to the (E)JES team at Phoenix Software International (PSI)…

Why not ask them yourself?  Ed Jaffe, the (E)JES CTO will be available at the forthcoming UK GSE Annual Conference, 5-6 November 2013, speaking about (E)JES System Management Software: More With Less For Less, For The z/OS Mainframe and z/OS 2.1 User Experiences.

21st Century Mainframe Capacity Planning Requirements

With nearly 5 decades of longevity the IBM Mainframe has changed beyond recognition in terms of CPU capacity and performance capability.  The Capacity Planning discipline for the IBM Mainframe server became more advanced and proactive in the early 1990’s, perhaps coinciding with the introduction of Parallel Sysplex structures associated with the MVS/ESA operating system.  Therefore the requirement to measure and model the impact of workload movement between LPAR and CPC structures became important, if not mandatory.

The fundamental building-block for Mainframe CPU usage analysis is SMF Type 7n records (I.E. RMF or CMF), where this data was typically processed by MXG, MICS and maybe CIMS (acquired by IBM), generally using SAS for reporting purposes.  Other tools, including but not limited to, BEST/1 (acquired by BMC) and PERFMAN (acquired by ASG) also offered capacity planning and performance management solutions.  Therefore, for 20+ years the fundamental Mainframe CPU usage data and associated tools have remained largely the same.  However, maybe the IBM Mainframe server has changed, both in terms of underlying CPU chip technology and customer workload deployment…

I often hear capacity planners state something along the lines of “I can report on the past with 100% accuracy, but predicting the future might prove to be a little more difficult”!  Once again, going back to the early 1990’s, the IBM Mainframe had a typical if not generic workload profile deployment, namely On-Line Transaction Processing (E.g. CICS, IMS DC) and related Database Management Subsystems (E.g. DB2, IMS DB) with Batch Processing.  This somewhat limited workload profile simplified the Capacity Planning process, applying estimates of growth based on current usage.  However, when the Mainframe became more pervasive, taking on new workloads, how was the capacity planner supposed to estimate CPU requirements for their new business application workload?

IBM introduced the Large Systems Performance Reference (LSPR) methodology, designed to provide relative processor capacity data for IBM System/370, System/390 and z/Architecture processors.  All LSPR data is based on a set of measured benchmarks and analysis, covering a variety of System Control Program (SCP) and workload environments.  LSPR data is intended to be used to estimate the capacity expectation for a production workload when considering a move to a new processor.  Although LSPR data is provided on an “as is” basis, with no warranty, it at least provides the Mainframe Capacity Planner with some insight into their CPU sizing challenge.  For many years, LSPR provided the only other data source, as well as RMF (CMF) for Mainframe CPU sizing.  However, is there a more accurate data source, perhaps based on real-life customer data?

With the introduction of the IBM System z10 server (February 2008), a new function CPU MF (CPU Measurement Facility) was incorporated.  Let’s not forget, z10 is now an n-2 technology, having been superseded by the z196/z114 and the latest zBC12/zEC12 generation of servers.  So each and every committed Mainframe customer should be positioned to benefit from the CPU MF function.

CPU MF provides optional hardware assisted collections of information about logical CPU activity executed over a specified interval in selected Logical Partitions (LPARs).  The CPU MF counters function is intended to be run on a constant basis to collect long-term performance data (I.E. SMF Record 113), in a similar manner to how you collect other performance data.  Therefore this data source can be deployed to further refine the accuracy of Mainframe CPU capacity planning projections.  Let’s not forget:

The primary on-going requirement for Mainframe Capacity Planning is to minimize any over or under capacity provision from forecast predictions, used for Mainframe server acquisition purposes”

Mainframe chip technology has also changed in complexity, especially with the latest iterations of CPU chips associated with the z10 server (E.g. POWER 6) onwards, incorporating many layers of cache memory.  Workload capacity performance will be quite sensitive to how deep into the memory hierarchy the processor must go to retrieve the workload’s instructions and data for execution.  Best performance occurs when the instructions and data are found in the cache(s) nearest the processor so that little time is spent waiting prior to execution; as instructions and data must be retrieved from farther out in the hierarchy, the processor spends more time waiting for their arrival.

As workloads are moved between processors with different memory hierarchy designs, performance will vary as the average time to retrieve instructions and data from within the memory hierarchy will vary.  Additionally, once on a processor this component will continue to vary significantly as the location of a workload’s instructions and data within the memory hierarchy is affected by many factors including; locality of reference, IO rate, competition from other resources (E.g. Applications, LPARs, et al), and so on…

The most performance sensitive area of the memory hierarchy is the activity to the memory nest, namely, the distribution of activity to the shared caches and memory.  IBM introduced new terminology, namely Relative Nest Intensity (RNI), indicating the level of activity to this part of the memory hierarchy.  Using data from CPU MF, the RNI of the workload running in an LPAR may be calculated.  The higher the RNI, the deeper into the memory hierarchy the processor must go to retrieve the instructions and data for that workload.

Therefore the Mainframe Capacity Planner does have various data sources available to forecast how an existing or new workload might perform on an upgraded processor (CPC), further refining their CPU capacity requirement forecast.  As always, the final stage in a Mainframe Capacity Planning process is to input the forecast data into the IBM Processor Capacity Reference (zPCR) tool, to determine the exact model and associated resource configuration options for their unique business workload mix.

To summarize, does your Mainframe Capacity Planning process incorporate all of these CPU sizing data sources, in an easy-to-use and cost efficient manner?

Founded by former IBM staffers and capacity planning and performance management industry veterans William Shelden, PhD, and William Hart, PerfTechPro is designed to deliver sophisticated, affordable, easy-to-use solutions for IT management professionals looking for fast, insightful help without high-cost, complex and time-consuming purchasing and licensing requirements.

PerfTechPro for z/OS is a Capacity Planning and Performance Measurement tool specifically designed for the cost conscious and savvy 21st Century data centre.  PerfTechPro for z/OS is the next evolution in Mainframe Capacity Planning tools, having been architected from ground zero using the latest techniques.  PerfTechPro for z/OS provides sophisticated capacity and performance management capabilities, affordable by any sized data centre:

  • Clean, intuitive, easy-to-use interface and graphical representations, for example:
    • Consolidated instance lists guide users to make informed selections
    • Descriptive dialog boxes detail your configuration
    • Anticipates, pre-loads data to speed retrieval, reporting and analysis
    • Automated data management
  • Forecasting and modelling
  • Non-proprietary database, enabling data use outside of PerfTechPro
  • Capable of automated collection, analysis and reporting of SMF 113 records produced by the IBM CPU Measurement Facility (CPU MF)
  • Supports measurement, management of zAAP & zIIP Specialty Engines
  • Automated analysis and management of all key capacity and performance metrics, for example:
    • GPP Utilization of All LPARs
    • MIPS Usage by CPU
    • DASD Response Times
    • Address Spaces Dispatched and Waiting 

PerfTechPro for z/OS also simplifies the data management process associated with Mainframe Capacity Planning.  Using a streamlined process on the z/OS host, PerfTechPro extracts and formats the data required from various SMF sources (E.g. SMF Type 7n, Type 113); delivering an optimized Performance Data Base (PDB) for use by the Windows based GUI.  This optimized file safeguards fast processing during the reporting and forecasting activities, while simplifying any data aggregation processes (E.g. Weekly, Monthly, et al).  Moreover, PerfTechPro allows this data to be stored in non-proprietary (E.g. Microsoft Access, SQL Server, MySQL, Oracle) and multiple database structures, as and if required.

PerfTechPro for z/OS is a simple-to-use and cost-efficient solution, allowing customers to quickly save time and money from their Capacity Planning and Performance Measurement solution.  Ultimately the bottom line objective for PerfTechPro for z/OS is to provide a best-of-breed solution for a very competitive cost. PerfTechPro for z/OS delivers business value by:

  • Ensuring enterprise zSeries Mainframe server resources are being used efficiently
  • Maximizing opportunities for cost-savings
  • Anticipating & responding to increased demand on resources
  • Reducing costs by exploiting periods of lower resource demand
  • Discerning underlying causes of performance and capacity issues
  • Eliminating time-consuming manual tracking, recording and analysis
  • Implementing disciplined management of valuable business resources

In conclusion, the Mainframe Capacity Planning process continues to evolve, forever striving to reduce any discrepancies in CPU requirements forecasting, which of course, have a high associated cost consideration.  Integrating CPU MF (SMF Type 113) must be a mandatory requirement, safeguarding that CPU Sizing, Forecasting, Modelling and Correlation Analysis activities are optimized.  Additionally, the actual process of Mainframe Capacity Planning is an activity that requires great skill and considerable associated responsibility.  A modern day solution such as PerfTechPro for z/OS is worthy of consideration, having been designed by a team with a heritage in delivering Mainframe Capacity Planning solutions, architecting function compatible with modern day functionality, while considering the latest technology zSeries CPU chip design considerations.

Application Performance Tuning – Why Bother?

With older generations of Mainframe Operating Systems, certainly MVS/XA and perhaps MVS/ESA, application performance tuning was a necessity, not an afterthought.  Quite simply, the cost of Mainframe resources, namely CPU, memory and disk, dictated that your mission critical business application might not perform to business requirements, unless you tuned your programming code.  Programmers, both of the system and application variety understood the bits and bytes of available programming languages (E.g. ASM, COBOL, PL/I) and Operating System (I.E. MVS), collaborating either via proactive process, or reactive problem solving.  With the continuing reduction of IT hardware component costs, the improvement in Operating Systems (E.g. 64-bit architecture) and newer programming languages (E.g. C, C++), it seems that application performing tuning is somewhat of an afterthought, but at what cost?

We all know that the cost of a Mainframe MIPS is significant, and although it might have reduced dramatically from a hardware viewpoint, from a software viewpoint, the cost remains largely static at ~£1,500-£3,500, per year, depending on your configuration.  So if your applications are burning several hundred if not several thousand extra MIPS unnecessarily, that’s very expensive indeed!  Additionally and just as importantly, a badly tuned system will manifest itself in slower transaction response times and longer batch jobs, if applicable, which could impact service availability.  So why is there a seeming reluctance to tune business applications, Mainframe resident or not?

If ever there was a functional IT area where the skills gap has never been wider, then application performance tuning is said skill, when comparing the salty old sea dog Mainframe dinosaur, with the newer Mainframe technician!

From an application development process viewpoint, where does the application performance tuning task live; before or after implementation?  The cynical amongst us will know; if it’s after implementation, there’s a strong likelihood said activity will never be performed!  If it’s before implementation, how many projects incorporate a meaningful stress test, or measure transaction response times versus an SLA or KPI metric?  Additionally, if the project is high-priority and/or running behind schedule, then performance testing is an activity that is easily removed…

Back in the good old days, the late 1980’s to early 1990’s, some application performance tuning tools did start to emerge, most notably Strobe.  Strobe was useful to even the most accomplished of system and application programmer personnel, and invaluable to less experienced personnel, and so arguably Strobe became the de facto software tool for tuning Mainframe applications.  However, later releases of MVS (E.g. OS/390 and z/OS), the non-event that was the Year 2000 (Y2K), seemed to remove the focus on and importance of application tuning.

Arguably most importantly of all, that software MIPS cost item, where Strobe and its competitors (E.g. ASG/BMC TriTune, CA Application Tuner, IBM APA, Macro4 ExpeTune, et al) will utilize even more CPU to capture diagnostic trace information, contributed to the demise of application performance tuning.  However, those companies that have undertaken such application tuning activities in the last decade or so are sitting pretty, having reduced the CPU (MIPS) resource consumed, lowering TCO and optimizing performance accordingly.  In the 21st Century, these software solutions are classified as Application Performance Management (APM) solutions.

Is there a better and easier way to stimulate an interest in the application performance tuning discipline?  If the desire exists to tune an application, lowering CPU MIPS usage, optimizing service performance, then the traditional tools and methods mentioned previously exist, but perhaps a new (or not so new) CPU performance data source exists…

With the introduction of the z10 server, a new function CPU MF (CPU Measurement Facility) was incorporated.  Let’s not forget, z10 is now an n-2 technology, having been superseded by the z196/z114 and the latest zBC12/zEC12 generation of servers.  So each and every committed Mainframe customer should be positioned to benefit from the CPU MF function.

CPU MF provides optional hardware assisted collections of information about logical CPU activity executed over a specified interval in selected Logical Partitions (LPARs).  The CPU MF counters function is intended to be run on a constant basis to collect long-term performance data (I.E. SMF Record 113), in a similar manner to how you collect other performance data.  I have previously briefly discussed how CPU MF SMF data can be used to increase Mainframe Server Capacity Planning efficiencies. 

The CPU MF sampling function is a short duration, precise function that identifies where CPU resources are being used, to help you improve application efficiency.  Put very simply, CPU MF sampling data has minimal CPU overhead (E.g. ~0.1-1.0%) when collecting data (I.E. z/OS Hardware Instrumentation Services – HIS), but this data can then be used to identify CPU “hot spots”, which can then be further analysed to identify the “areas of code” generating the high CPU usage.  However, it was forever thus, whether an APM tool, or CPU MF sampling data, high CPU usage can be identified, but the application programmer must undertake the task of optimizing the application code!

IBM have done a great job in providing CPU MF counters data, optimizing the Capacity Planning process with the SMF 113 record, and the realm of possibility exists with the sample data, but a software solution is required to analyse and summarize this data.

Currently there are very few if only one software solution that analyses CPU MF sample data, namely zHISR from Phoenix Software International.  zHISR interfaces directly with z/OS Hardware Instrumentation Services to collect data for hotspot analysis of customer, vendor, or operating system program execution.  zHISR features include:

  • Support for up to 128 simultaneous data collections events.  zHISR collections do not interfere with any HIS functions including sample or counter collection.
  • System console commands for many zHISR functions.
  • An Application Programming Interface to COBOL and Assembler for starting and stopping data collections. Collection lengths for API generated collections have a time range of one second or more.
  • Ability to schedule a collection with JCL so that collection starts when a given job or step begins.
  • Ability to store data collections as z/OS data sets or UNIX files.
  • Support for collections against CICS/TS transactions.
  • Analysis based on a time range within the collected data for a narrower spotlight on problem code.

An intuitive ISPF dialog allows the user to easily produce a CPU hot spots analysis, which can then be used for identifying the offending code sections.  The user can then drill down and highlight the high CPU CSECT and program offset (instruction), comparing with their Associated Data (ADATA), and thus the source programming instruction.  Therefore the skill required to perform analysis is minimal, as is the CPU overhead in collecting analysis data, and so eradicating the potential barriers when embarking on an application tuning initiative.  Furthermore, the actual cost of deploying the zHISR software is not onerous and so perhaps each and every committed Mainframe user can easily include application performance tuning into their application development lifecycle processes. 

zHISR has a UNIX file system interface that lets you navigate the system and browse or delete files.  With zHISR, users can start and stop hardware event data collections and view the status of the current or prior HIS run.  zHISR also includes a memory display/alter utility that lets you view main storage in the CPU you are logged on to.  If zIIPs are present and zHISR is defined as an authorized subsystem, nearly all of the CPU processing used by zHISR is redirected to a zIIP.

There are also instances, however few and far between, where Mainframe customers have written their own proprietary in-house OLTP (On-Line Transaction Processor) and Relational Database Management Subsystem (RDBMS), where traditional APM software tools can’t provide a solution, only interfacing with underlying subsystems (E.g. Adabas, CICS, DB2, IDMS, WebSphere, et al).  In these instances, CPU MF and zHISR offer a solution to help such customers, who probably face challenges when they upgrade their Mainframe servers, safeguarding software and application code is compatible with the new hardware, and ideally, exploits the latest functionality.

In conclusion, application performance tuning has to be a very important if not mandatory activity for the Mainframe Data Centre.  Whether via CPU MF or traditional APM software solutions, the cost reduction and performance improvement benefits of tuning should be compelling reasons to proactively engage in application tuning activities.  From a skills viewpoint, maybe the KISS (Keep It Simple Stupid) principle can apply, where CPU MF collects the data very simply and efficiently, complemented by zHISR, analysing the data in an intuitive and cost optimized manner.

So turning the subject matter on its head, Application Performance Tuning – Why Bother?  Why not!

Further information can be found from my z/OS Application Performance Tuning presentation, delivered at UK GSE in November 2012.

IBM Mainframe Capacity Planning & Software Cost Control Interaction?

The cost of IBM Mainframe software is an extensive subject matter that is multi-faceted and can generate much discussion. The importance of optimizing Mainframe software costs is without doubt, as it is the most significant Mainframe TCO component, having increased from ~25-50%+ of overall expenditure in the last decade or so. Conversely Mainframe server hardware costs have largely stabilized at ~15-25% of TCO in the same time period. However, Mainframe Capacity Planning activities have evolved over the last several decades or so, where hardware costs were the primary concern and the number of IBM Mainframe software pricing mechanisms was limited. Of course, in the last decade or so, IBM Mainframe software pricing mechanisms have evolved, with a plethora of acronyms, ESSO, ELA, IPLA, OIO, PSLC, WLC, VWLC, AWLC, IWP, naming but a few!

Can each and every IBM Mainframe user clearly articulate their Mainframe Capacity Planning and Software Cost Control policies, and which person in their organization performs these very important roles? Put another way, not forgetting Software Asset Management (SAM), should there be a Software Cost Control specialist for IBM Mainframe Data Centres…

If we consider the traditional Mainframe Capacity Planning role, put very simply, this process typically produces a 3-5 year rolling plan, based upon historical data and future capacity requirements. These requirements can then be modelled with the underlying hardware (E.g. z10, z114/z196, zEC12) server, identifying resource requirements accordingly, namely number of General Processors (GPs), Specialty Engines (E.g. zIIP, zAAP, IFL), Memory, Channels, et al. Previously, up until ~2005, customer requirements would be articulated to IBM, cross-referenced with LSPR (Large System Performance Reference) and an optimum hardware configuration derived. Since ~2005, IBM made their zPCR (Processor Capacity Reference) tool Generally Available, allowing the Mainframe customer to “more accurately” capacity plan for IBM zSeries servers.

Other enhancements to more accurately determine the ideal zSeries server include sizing based on actual customer usage data generated by the CPU MF facility introduced with the z10 server. CPU MF delivers a refinement when compared with LSPR, refining the zPCR process with real life customer usage data, compared to the standard simulated LSPR workloads.

In summary, the Mainframe Capacity Planning process has evolved to include new tools and data to refine the process, but primarily, the process remains the same, size the hardware based upon historical data and future business requirements. However, what about Mainframe software usage and therefore cost interaction?

Each and every IBM Mainframe user relies heavily on the IBM Operating System (I.E. z/OS, z/VM, z/VSE, zLinux, et al) and primary subsystems (I.E. CICS, DB2, MQ, IMS, et al). Some Mainframe users might deploy alternative database and transaction processing (TP) solutions, but a significant amount of Mainframe software cost is for IBM software products. In the late-1990’s, IBM introduced their PSLC (Parallel Sysplex License Charges), which offered lower aggregate (MSU) pricing for major IBM software products, based upon an eligible configuration (E.g. Resource Sharing). This pricing mechanism had no impact on software cost control, in fact quite the opposite; it was a significant cost benefit to implement PSLC!

In 2000 IBM announced Workload License Charges (WLC), which allowed users to pay for software based upon the workload size, as opposed to the capacity of the machine; thus the first signs of sub-capacity pricing. In 2001, the ability to deploy IBM eligible software on a “pay for what you use” basis was possible, as per the Variable Workload License Charge (VWLC) mechanism. Put very simply, a Rolling 4 Hour Average (R4HA) MSU metric applies for eligible IBM software products, where software is charged based upon the peak MSU usage during a calendar month. The Mainframe user pays for VWLC software based upon the R4HA or Defined Capacity (Sub-Capacity vis-à-vis Soft Capping), whichever is lowest.

From this point forward, and for the avoidance of doubt, for the last 10 years or so, there has been a mandatory requirement to consider the impact of IBM WLC software costs, when performing the Mainframe Capacity Planning activity. One must draw one’s own conclusions as to whether each and every Mainframe user has the skills to know the intricacies of the various software (E.g. IPLA, OIO, PSLC, WLC, et al) pricing models, when upgrading their zSeries server.

With the IBM Mainframe Charter in 2003, IBM stated that they would deliver a ~10% technology dividend benefit, loosely meaning that for each new Mainframe technology model (I.E. z9, z10), a lower MSU rating of 10% applied for the for the same system capacity level, when compared with the previous technology. Put another way, a potential ~10% software cost reduction for executing the same workload on a newer technology IBM Mainframe; so encouraging users to upgrade. However, the ~10% software cost reduction is subjective, because a higher installed MSU capacity dictates lower per MSU software costs…

With the introduction of the z196 and z114 Mainframe servers the technology dividend was delivered in the form of a new software license charge, AWLC (Advanced Workload License Charges), where lower software costs only applied if this new pricing model was deployed. A similar story for the zEC12 server, the AWLC pricing model is required to benefit from the lower software costs! If these software pricing evolutions were not enough, in 2011 IBM introduced the Integrated Workload Pricing (IWP) mechanism, offering potential for lower software pricing based upon workload type, namely a WebSphere eligible workload. Finally, and as previously alluded to, as MSU capacity increases, the related cost per MSU for software decreases, so there are many IBM software pricing mechanisms to consider when adding Mainframe CPU capacity. So once again, who is the IBM Mainframe Software Cost Control specialist in your organization?

For sure, each and every IBM Mainframe user will engage their IBM account team as and when they plan a Mainframe upgrade process, but how much “customer thinking is outsourced to IBM” during this process? Wouldn’t it be good if there was an internal “checks & balances” or due diligence activity that could verify and refine the Mainframe Capacity Plan with IBM software cost control intelligence?

Having travelled and worked in Europe for 20+ years, I know my peers, colleagues and friends that I have encountered can concur with my next observation. The English and Americans might come up with a good idea and perhaps product, the French are most likely to test that product to destruction and identify numerous new features, while the Germans will write the ultimate technical manual…

zCost Management are a French company that specializes in cost optimization services and solutions for the IBM Mainframe. From an IBM Mainframe Capacity Planning & Software Cost Control Interaction viewpoint, they have developed their CCP-Tool (Capacity and Cost Planning) software solution. This software product bridges the gap between Mainframe Capacity Planning for hardware and the impact on associated IBM software (E.g. WLC, IPLA, et al) costs.

CCP-Tool facilitates medium-term (E.g. 3-5 year) Mainframe Capacity Planning by controlling Monthly License Charges (MLC) evolution, generating cost control policies, optimizing zSeries (E.g. PR/SM) resource sharing, delivering financial management via IBM Mainframe software cost control activity. CCP-Tool integrates with existing data and activities, using SMF Type 70 & 89 records, defining events (I.E. Capacity Requirements, Workload Moves) in the plan, simulating many options, delivering your final capacity plan and periodically (I.E. Quarterly) reviewing and revising the plan. Most importantly, CCP-Tool deploys many algorithms and techniques aligned to IBM software pricing mechanisms, especially WLC and R4HA related.

Therefore CCP-Tool delivers a financial management framework via a medium-term Capacity Plan with associated software cost control and zSeries (E.g. PR/SM) resource policies. This enables a balanced viewpoint of future Data Centre cost configurations from both a hardware and related IBM Mainframe software viewpoint. Moreover, for those IBM Mainframe users that don’t necessarily have the skills to perform this level of Mainframe cost control, CCP-Tool delivers a low cost solution to empower the Mainframe customer to engage IBM on an equal footing, at least from a reporting viewpoint. Similarly, for those Mainframe users with good IBM Mainframe software cost control skills, CCP-Tool offers a “checks & balances” viewpoint, delivering that all important due diligence sanity check! Quite simply, CCP-Tool simplifies the process of reconciling the optimal configuration both from an IBM Mainframe hardware and related software viewpoint.

Without doubt, if a Mainframe user still deploys a hardware centric viewpoint of the capacity planning activity, without considering the numerous intricacies of IBM Mainframe software pricing, in most cases, this could be a significant cost oversight. Put very simply, a low-end IBM Mainframe user of ~150 MSU (1,000 MIPS) might spend ~£1,000,000 per annum, just for a minimal configuration of z/OS, CICS, COBOL and DB2 software, so one must draw one’s own conclusions regarding the potential cost savings, when deploying the optimal zSeries hardware and associated IBM software configuration. I paraphrase Oscar Wilde:

“The definition of a cynic is someone that knows the price of everything, and the value of nothing!”

So, let’s reprise. You have performed your Mainframe Capacity Planning activity, considered historical SMF data for CPU usage, maybe including the R4HA metric, factored in additional new and growth business requirements, refined the capacity plan by using the zPCR tool, perhaps with data input from CPU MF and you now have identified your optimum zSeries Mainframe server?

Maybe you should think again, because the numerous IBM MLC software pricing mechanisms could impact your tried and tested Mainframe CPU hardware planning process. Firstly, for MLC software, the unit cost per MSU reduces, as the installed MSU capacity increases. In simple terms, this encourages the use of “large container” processing entities, LPARs and CPCs. Other AWLC and IWP related considerations further encourages the use of major subsystems (E.g. CICS, DB2, WebSphere, IMS) in larger MSU capacity LPARs and CPCs to benefit from the lowest unit cost per MSU. Additionally, do you really need to run all software on all processing entities? For example, programming languages (E.g. COBOL, PL/I, HLASM, et al) are not necessarily required in all environments (E.g. Test, Development, Production, et al). It is not uncommon for compile and link-edit functions to be processed in Development environments only, while only run-time libraries are required for Production. These “what if” scenarios generated by the numerous IBM MLC software pricing mechanisms must be considered, ideally by an internal resource, with the requisite skills and experience.

Today, who is performing this Mainframe Software Cost Control in your organization? Is it an internal resource with the requisite skills, an independent 3rd party, IBM or nobody? One must draw one’s own conclusions as to whether any of these parties who could perform this vital activity has a vested interest or not, and thus a potential conflict of interests…