In the last few years I have heard a number of Mainframe decision makers state something like, “technology advances dictate that we don’t have to care too much about performance; refreshing our technology platform every 2-3 years ensures we have the fastest environment”! We must draw our own conclusions as to whether such an observation is somewhat cursory, and it’s probably not one shared by seasoned Mainframe professionals with several decades of experience. Where might such a notion come from?
Moore’s law is an observation that over the history of computing hardware, the number of transistors in a dense integrated circuit doubles approximately every two years. Some might say that this trend is slowing down as of this timeframe, namely 2014, such that capacity/performance increases will only happen every three to four years. This is somewhat arbitrary, but one interpretation of Moore’s law might be that refreshing technology every 2-3 years safeguards against hardware defects and obsolescence, while leveraging the latest performance and associated functionality. Such an interpretation is consistent with the initial observation of some Mainframe decision makers; but is it really that simple?
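To put some rough numbers on that interpretation, the doubling rule can be sketched in a few lines of Python. This is only an illustration: it treats transistor count as a proxy for capacity, which is itself an assumption, and the function name and the 3-year refresh cycle are hypothetical choices, not figures from any vendor.

```python
def capacity_multiplier(years: float, doubling_period_years: float) -> float:
    """Relative capacity after `years`, assuming capacity doubles
    every `doubling_period_years` (a simplification of Moore's law)."""
    return 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    refresh_cycle_years = 3  # hypothetical technology refresh interval
    for doubling_period in (2, 3, 4):
        gain = capacity_multiplier(refresh_cycle_years, doubling_period)
        print(f"Doubling every {doubling_period} years: "
              f"x{gain:.2f} capacity per {refresh_cycle_years}-year refresh")
```

Under these assumptions, the gain delivered by each refresh shrinks noticeably as the doubling period stretches from two years towards four, which is one reason not to rely on hardware alone.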
As always, the devil is in the detail, and neither life nor technology is ever that simple. Observing that life is a collection of experiences, those Mainframe professionals with several decades or more of experience will remember the importance of “getting the maximum bang from their buck”, and so I/O subsystem tuning was never an afterthought; it was mandatory. The cost of IBM Mainframe DASD storage and, more importantly, CPU and memory resource dictated that optimal I/O tuning was the only way to accommodate a workload within a single IBM Mainframe Server footprint, or to complete the batch work in the overnight window, allowing the on-line day to process. A question for today’s 21st Century zSeries decision maker, who probably does not have that experience to draw on, should be, “are zSeries I/O subsystem tuning skills still required?” The simple answer is yes; but why?
Nothing in the IT world remains static; change is inevitable and a good thing. Technology upgrades might mask performance problems, with faster zSeries CPU chips, memory, FICON channels and DASD resources, but technology evolution can also introduce the opportunity to improve performance, or introduce new anomalies.
Let’s just take a minute to remind ourselves of a fundamental Data Processing (DP) and Information Technology (IT) principle. The data must be moved from fixed storage via the Input/Output (I/O) subsystem to the CPU in order to be processed. It therefore follows that badly designed or poorly performing I/O will unnecessarily increase CPU usage, while slowing down associated data delivery. Conversely, a well-tuned I/O subsystem will preserve CPU resource for other more important work, optimizing processing times, while reducing costs. In a world where “doing more with less” is an objective for us all, wouldn’t a great notion be something like “an optimized I/O subsystem might delay a CPU upgrade by 1, 2 or 3 years”?
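The “delay a CPU upgrade” notion can be illustrated with some simple compound-growth arithmetic. The sketch below is not a capacity planning model; every number in it (80% current utilisation, 10% annual workload growth, a 90% upgrade threshold and a 15% CPU saving from I/O tuning) is a hypothetical assumption chosen purely for illustration, as is the function name.

```python
import math


def years_until_threshold(current_util: float,
                          annual_growth: float,
                          threshold: float = 0.90) -> float:
    """Years until CPU utilisation reaches `threshold`, assuming the
    workload grows by `annual_growth` (e.g. 0.10 for 10%) each year."""
    if current_util >= threshold:
        return 0.0
    return math.log(threshold / current_util) / math.log(1 + annual_growth)


if __name__ == "__main__":
    current_util = 0.80    # hypothetical: CPU is 80% busy today
    annual_growth = 0.10   # hypothetical: workload grows 10% per year
    tuning_saving = 0.15   # hypothetical: I/O tuning removes 15% of CPU usage

    untuned = years_until_threshold(current_util, annual_growth)
    tuned = years_until_threshold(current_util * (1 - tuning_saving),
                                  annual_growth)

    print(f"Upgrade needed in {untuned:.1f} years without tuning")
    print(f"Upgrade needed in {tuned:.1f} years with tuning")
    print(f"Upgrade deferred by {tuned - untuned:.1f} years")
```

The actual deferral depends entirely on the real growth rate and the real CPU saving, both of which need to be measured for a given environment rather than assumed.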
Each and every zSeries Mainframe environment has subsystems, files, application programs and JCL that must be adjusted and maintained regularly to minimize CPU resource, while optimizing data processing speed. From a business viewpoint, this means optimal consistent batch processing and on-line transactions, which safeguard Service Level Agreement (SLA) and Key Performance Indicator (KPI) metrics, while minimizing TCO.
All IT environments, including the zSeries Mainframe, degrade when files grow in size, employees leave or retire (skills/experience are lost), new application programs are introduced, vendor software is upgraded and hardware is refreshed. This is the reason we should have a proactive maintenance philosophy. However, maintenance is not just applying the latest software service or application code fixes; it’s also ensuring that performance is maintained, for each and every interrelated I/O component. When the overall maintenance discipline is not deployed, performance for the zSeries Mainframe environment will inevitably degrade, generating potential business challenges.
One consequence might be CPU resource constraints, where processing tasks (I.E. Batch Jobs, On-Line Transactions, et al) no longer complete within a reasonable, business-facing timeframe. Mission Critical on-line transaction response times and batch processing may slow, and SLAs might not be delivered. Accelerating the inevitable upgrade or technology refresh to a larger and faster computing platform will therefore become a necessity, as opposed to a financially sensible and ideally timed asset upgrade. As the saying often attributed to Einstein goes, “insanity is doing the same thing over and over again and expecting different results”! In this instance, technology refreshes or upgrades don’t resolve underlying I/O subsystem issues; they just hide them.
Over the last decade or so, IBM has continued to add increased function and associated performance capability to the zSeries I/O subsystem and related DFSMS product, for example:
- FICON Express8S: Incorporating an Application Specific Integrated Circuit (ASIC), designed to support 8 Gb/Second (Maximum Read & Writes – ~52,000 IOs/Second & ~770 MB/Second Throughput).
- zHPF: High Performance FICON for System z (zHPF) is a data transfer protocol, optionally deployed for accessing data via advanced disk storage (I.E. IBM DS8000, EMC VMAX, HDS VSP, et al). Combined with FICON Express8S (Maximum Read & Writes – ~92,000 IOs/Second & ~1600 MB/Second Throughput).
- System-Managed Buffering: VSAM can use System-Managed Buffering (SMB) to determine the number of buffers and the type of buffer management to use for VSAM data sets.
- Data Striping (SAM & VSAM): Allows sequential I/O to be performed for a data set at a rate greater than that allowed by the physical path between the DASD and the CPU (I.E. spreading the data set among multiple stripes on multiple DASD volumes and control units).
- MIDAW: The Modified Indirect Data Address Word (MIDAW) facility improves FICON performance, especially when accessing DB2 databases. MIDAW is an optimized I/O (CCW) function for gathering data into and scattering data from discontinuous storage locations during an I/O operation.
- Record Level Sharing (RLS): A function that allows VSAM data sets to be fully shared with data integrity among multiple users (E.g. CICS Regions) across multiple systems.
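To give a feel for what the FICON Express8S and zHPF throughput figures quoted in this list mean in practice, here is a minimal Python sketch that estimates how long it takes to move a given amount of data. It assumes the quoted maximum rates are actually sustained, and it models data striping as perfectly linear scaling across stripes; both are idealizations, and the function name and the 1 TB workload are hypothetical.

```python
# Maximum throughput figures quoted in the list above (MB/Second).
FICON_EXPRESS8S_MB_PER_SEC = 770.0
ZHPF_MB_PER_SEC = 1600.0


def transfer_seconds(data_mb: float,
                     throughput_mb_per_sec: float,
                     stripes: int = 1) -> float:
    """Idealized elapsed time to move `data_mb` of data.

    Assumes the quoted maximum rate is sustained and that striping
    scales linearly with the number of stripes (an optimistic assumption).
    """
    return data_mb / (throughput_mb_per_sec * stripes)


if __name__ == "__main__":
    one_terabyte_mb = 1_000_000.0  # hypothetical sequential workload

    for label, rate in (("FICON Express8S", FICON_EXPRESS8S_MB_PER_SEC),
                        ("FICON Express8S + zHPF", ZHPF_MB_PER_SEC)):
        for stripes in (1, 4):
            minutes = transfer_seconds(one_terabyte_mb, rate, stripes) / 60
            print(f"{label}, {stripes} stripe(s): {minutes:.1f} minutes")
```

Real-world elapsed times will be longer, as control unit, cache and device contention are ignored here, but the relative difference shows why these functions are worth exploiting explicitly.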
There are other zSeries I/O subsystem improvements not listed here, including compression (E.g. zSeries Host and zEDC) and availability (E.g. VSAM CA Reclaim), but as always, a little knowledge can be a dangerous thing! Although some of these features can be implemented transparently, or implicitly by doing nothing, explicit action is generally advisable: reviewing the current environment, performing before and after implementation benchmarks, sanity checking and measuring the impact of our changes.
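A before and after benchmark comparison need not be complicated. The following Python sketch compares the average of a set of measurements taken before a change with those taken after it. The metric (batch job elapsed time in minutes), the sample values and the function name are all hypothetical; in practice the data would come from whatever performance reporting is already in place.

```python
from statistics import mean


def percent_change(before: list[float], after: list[float]) -> float:
    """Percentage change in the average measurement; a negative value
    means the metric went down (an improvement for elapsed time)."""
    baseline = mean(before)
    return (mean(after) - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical elapsed times (minutes) for the same nightly batch job.
    elapsed_before = [42.0, 45.0, 41.0, 44.0]
    elapsed_after = [33.0, 35.0, 34.0, 32.0]

    change = percent_change(elapsed_before, elapsed_after)
    print(f"Average elapsed time changed by {change:+.1f}%")
```

A handful of runs either side of the change is the minimum; comparing like-for-like workloads and time periods matters more than the arithmetic itself.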
Returning full circle to that initial observation of the business and financially orientated Mainframe decision maker; who or where is the seasoned z/OS technician that understands the end-to-end I/O subsystem? Of course, such a person can deliver tremendous value. Even from a purely business viewpoint, optimizing I/O performance typically generates benefit through reduced processing time and lower operational (TCO) costs. Therefore, why wouldn’t we perpetually tune our I/O subsystems, primarily to safeguard our Mission Critical SLA and KPI metrics?
Generally there is no substitute for experience and this is certainly true of the Critical Path Software Inc. (CPSI) team, headed up by Robert Burns, Jr. and Ralph E. Bertrum, who have delivered hundreds of successful zSeries I/O subsystem reviews, globally. CPSI offer a suite of software products, code named Turbo, and it was forever thus: the software product can do ~80% of the work, but only the final 20% of work performed by the experienced technician delivers any real benefit to the business. This is where the CPSI team can assist the zSeries Mainframe community, delivering services either in the form of training or professional services to ensure that their customer derives “maximum bang from their buck”.
The CPSI portfolio incorporates all aspects of zSeries I/O subsystem tuning, for example:
- Turbo-Tune: System wide I/O subsystem analysis solution, performing numerous “what if” scenarios, comparing an extensive database of optimization parameters, to deliver the optimum settings on a file-by-file basis.
- Turbo-CICS: A CICS Contention Analyzer solution that leverages the Turbo expert performance tuning database combined with real-life (your) customer CICS statistics to identify and eradicate workload spikes and associated slow response times in CICS (Non-DB2) regions.
- Turbo-DB2: A DB2 Contention Analyzer solution that leverages the Turbo expert performance tuning database combined with real-life (your) customer CICS, DB2 and ICF Catalog statistics to identify and eradicate workload spikes and associated slow response times in CICS/DB2 regions.
- Turbo-VS: A managed service to deliver proactive VSAM settings for any zSeries Mainframe user, large or small. V-Source (VS) is a virtual dedicated VSAM expert, covering all of the technologies under the VSAM umbrella including CICS, DB2, ICF Catalogs, IMS, JCL, System Files, et al. Optimal VSAM parameter orientated performance tuned libraries are delivered to the zSeries Mainframe user for testing and implementation.
In conclusion, whether or not to deploy a software solution and related consulting professional for I/O subsystem tuning is not the underlying question, although leveraging independent 3rd party experience is generally a good thing, for obvious reasons. What is of fundamental importance is recognition that the zSeries I/O subsystem does not tune itself and is a vital component for maintaining optimal business performance and related TCO. Ignore this discipline at your peril and don’t rely on technology upgrade/refresh activities to maintain performance. If I/O performance is poor, no amount of leading-edge hardware or software will resolve this underlying issue. As always, zSeries I/O performance management requires expertise and experience, which can be supplemented with software solutions for automation and optimization benefits.
Sometimes there is no substitute for experience, where the bits and bytes experience of working from the 1980s through to the present day provides a thorough grounding in how the zSeries Mainframe I/O subsystem works. Software products can help provide data analysis, but only experience of working with many environments can turn this inordinate amount of performance data into meaningful information and associated action plans.