Simplifying Db2 for z/OS CPU Optimization: Eradicating Inefficient SQL Processing

Without doubt the IBM Z Mainframe server is recognised as the de facto choice for storing mission critical System of Record (SOR) data in database repositories for 92 of the top 100 global banks, 23 of the 25 top global airlines, the top 10 global insurers & ~70% of all Fortune 500 companies. ~80% of mission critical data is hosted by IBM Z Mainframe servers, processing 30+ Billion transactions per day, including ~90% of all credit card transactions. This data is accessed by ~1.3 Million CICS transactions per second, compared with a Google (mostly search) processing rate of ~70,000 transactions per second. Interestingly enough, despite processing so many mission critical transactions the IBM Z Mainframe server platform is only accountable for ~6.2% of global IT spend. One must draw one’s own conclusions as to why some IT professionals perceive the IBM Z Mainframe server as being a legacy platform, not worthy of consideration as a strategic IT server platform…

The digital transformation has delivered an exponential growth of data, typically classified as Cloud, Mobile & Social based. This current & ever-growing data source requires intelligent analytics to deliver meaningful business decisions, requiring agile application software delivery to gain competitive edge. This digital approach can sometimes deliver a myriad of micro business application changes, personalised for each & every customer, often delivering “pop-up” applications…

IBM Z Mainframe software costs are often criticized as being a major barrier to maintaining or indeed commissioning the platform. IBM have tried to minimize these costs with numerous sub-capacity pricing options over the last 30 years or so, but this is perceived by many as being overly complicated; although with a modicum of knowledge, a specialized personnel resource can easily control software costs. All that said, IBM have introduced Tailored Fit Pricing for IBM Z, in an attempt to simplify software cost management. A recent blog reviewed the Tailored Fit Pricing for IBM Z offering. Whether or not you decide that this IBM Z pricing mechanism is suitable for your organization, optimizing IBM Z CPU MSU/MIPS usage is mandatory. Recognizing that the IBM Z Mainframe server is the de facto database server for System of Record data, primarily via the Db2 subsystem, clearly optimizing Db2 CPU usage, whether for OLTP transactions, typically via CICS, or for the batch window, has been & always will be, worthwhile…

All too often, many IT disciplines can be classified with a generic 80/20 rule & typically data can be classified accordingly, where 80% of data is accessed 20% of the time & 20% of data is accessed 80% of the time. The challenge with such a blunt Rule of Thumb (ROT) is that it’s static, but it’s a good starting point. Ideally for any large data source, there would be a dynamic sampling mechanism that would identify the most active data, loading this into the highest speed memory resource to reduce I/O access times & therefore CPU usage. Dynamic management of such a data buffer would render the 80/20 rule extraneous to requirements, as each & every business has their own data access profile. However, a simple cost benefit & therefore Proof of Value (POV) analysis could ensue.
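
As an illustration of the dynamic sampling idea, the following Python sketch measures how skewed a sampled access trace actually is, rather than assuming a fixed 80/20 split. The function name hot_set, the page identifiers & the sample trace are all hypothetical; this is not a Db2 facility, merely a minimal way of reasoning about which data would be worth keeping in the fastest memory.

```python
from collections import Counter


def hot_set(accesses, coverage=0.8):
    """Return (hot_keys, covered_share, key_share) for a sampled access trace.

    hot_keys is the shortest most-accessed-first list of keys whose
    combined accesses reach `coverage` of all sampled accesses.
    """
    counts = Counter(accesses)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0, 0.0
    hot, covered = [], 0
    for key, hits in counts.most_common():
        if covered >= coverage * total:
            break
        hot.append(key)
        covered += hits
    return hot, covered / total, len(hot) / len(counts)


# Hypothetical sample: 100 accesses spread over 5 data pages.
trace = ["P1"] * 50 + ["P2"] * 30 + ["P3"] * 10 + ["P4"] * 5 + ["P5"] * 5
hot_keys, covered_share, key_share = hot_set(trace)
print(hot_keys, covered_share, key_share)
```

In this made-up trace, 2 of the 5 pages account for 80% of the accesses. Re-running the same measurement on fresh samples is what would turn a static rule of thumb into a profile specific to each business.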

From a Db2 viewpoint, pre-defined structures such as buffer pools offer some relief in storing highly referenced data in a high-speed server memory resource, but this has a finite capacity versus performance benefit, not necessarily using the fastest memory structures available nor dynamically caching the most accessed data. The business considerations of not optimizing Db2 data access are:

  • Elongated Batch Processing: With ever increasing amounts of data to process & greater demands for 24*7*365 availability & real-time access, data access optimization is fundamental for optimized service delivery, often measured by mission critical SLA & KPI metrics. Optimized batch processing is a fundamental requirement for acceptable customer facing business service delivery.
  • Slow Transaction Response Times: As the nature of customer requirements changes, with mobile device applications exponentially increasing the number of daily transactions, overall system resource capacity is often stressed during peak hours. Optimized transaction response time is a fundamental requirement, being the most transparent service delivered to each & every end customer.

An easy but very expensive solution to remediate batch processing & transaction response issues is to provide more resources via a CPU server upgrade activity. A more sensible approach is to optimize the currently deployed resources, safeguarding that frequently accessed data is mostly if not always high speed cache resident, reducing the I/O processing overhead, reducing CPU usage, which in turn will optimize batch processing & transaction response times, while controlling associated IBM Z Mainframe server hardware & software costs.

The ubiquitous Db2 data access method is Structured Query Language (SQL) based, where IBM has their own implementation, SQL for Db2 for z/OS, which could be via the commonly used COBOL (EXEC SQL) programming language or a Db2 Connect API (E.g. ADO.NET, CLI, Embedded SQL, JDBC, ODBC, OLE DB, Perl, PHP, pureQuery, Python, Ruby, SQLJ). For Db2 Connect, there are 2 types of embedded SQL processing, static & dynamic SQL. Static SQL minimizes execution time by processing in advance. Though some relief is provided by Dynamic Statement Cache, dynamic SQL is processed when the SQL statement is submitted to the IBM Z Db2 server. Dynamic SQL is more flexible, but potentially slower. The decision to use static or dynamic SQL is typically made by the application programmer. There is a danger that Dynamic Statement Cache might be considered as a panacea for SQL CPU performance optimization, but as per any other performance activity, reviewing any historical changes is a good idea. The realm of possibility exists for the Db2 Subject Matter Expert (SME) to be pleasantly surprised that more often than not, there are still significant SQL CPU optimization opportunities…

From a generic Db2 viewpoint, with static SQL, you cannot change the form of SQL statements unless you make changes to the program. However, you can increase the flexibility of static statements by using host variables. Obviously, application program changes are not always desirable.

Dynamic SQL provides flexibility. If an application program needs to process many data types & structures, such that the program cannot define a model for each one, dynamic SQL overcomes this challenge. Dynamic SQL processing is used by facilities such as Query Management Facility (QMF), SQL Processing Using File Input (SPUFI) or the UNIX Systems Services (USS) Command Line Processor (CLP). Not all SQL statements are supported when using dynamic SQL. A Db2 application program that processes dynamic SQL accepts as input, or generates, an SQL statement in the form of a character string. Programming is simplified when you can structure programs not to use SELECT statements, or to use only those that return a known number of values of known types.
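
To make the "SQL statement in the form of a character string" idea concrete, here is a small Python sketch that builds a SELECT at run time & executes it with a parameter marker. It uses the SQLite engine from the Python standard library purely as a stand-in, not Db2 for z/OS; the table, column names & the build_select helper are hypothetical.

```python
import sqlite3

# Columns the program is prepared to handle; anything else is rejected
# so that run-time input cannot inject arbitrary SQL text.
ALLOWED_COLUMNS = {"id", "name", "balance"}


def build_select(columns):
    """Generate the statement text at run time from a requested column list."""
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"unsupported column(s): {unknown}")
    # The '?' parameter marker keeps the data value out of the statement text.
    return f"SELECT {', '.join(columns)} FROM account WHERE id = ?"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 'ALICE', 250.0)")

sql_text = build_select(["name", "balance"])  # statement only known at run time
row = conn.execute(sql_text, (1,)).fetchone()
print(sql_text)
print(row)
```

Because the column list is only known when the program runs, the number & types of returned values vary from call to call, which is the extra programming effort the paragraph above alludes to.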

For Db2 data access, SQL statement processing requires an access path. The major SQL statement performance factors to consider are the amount of time that Db2 uses to determine the access path at run time & whether the access path is efficient. Db2 determines the SQL statement access path either when you bind the plan or package that contains the SQL statement or when the SQL statement executes. The repeating cost of preparing a dynamic SQL statement can make the performance worse when compared with static SQL statements. However, if you execute the same SQL statement often, using the dynamic SQL statement cache decreases the number of times dynamic statements must be prepared.
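
The effect of a statement cache on prepare activity can be sketched in a few lines of Python. This is a conceptual model only, assuming as a simplification that a cache hit requires identical statement text; the StatementCache class & the tuple standing in for an access path are hypothetical, not Db2 internals.

```python
class StatementCache:
    """Toy model: remember the 'access path' chosen for each statement text."""

    def __init__(self):
        self._plans = {}
        self.prepares = 0  # expensive: access path had to be determined
        self.hits = 0      # cheap: previously prepared statement reused

    def prepare(self, sql_text):
        plan = self._plans.get(sql_text)
        if plan is None:
            self.prepares += 1
            plan = ("ACCESS-PATH", self.prepares)  # stand-in for a real plan
            self._plans[sql_text] = plan
        else:
            self.hits += 1
        return plan


# Literals embedded in the text: every statement looks new to the cache.
literal_cache = StatementCache()
for account_id in (101, 102, 103):
    literal_cache.prepare(f"SELECT BALANCE FROM ACCOUNT WHERE ID = {account_id}")

# Parameter marker: one statement text, prepared once, then reused.
marker_cache = StatementCache()
for account_id in (101, 102, 103):
    marker_cache.prepare("SELECT BALANCE FROM ACCOUNT WHERE ID = ?")

print(literal_cache.prepares, literal_cache.hits)
print(marker_cache.prepares, marker_cache.hits)
```

In this model the literal form is prepared three times, while the parameter marker form is prepared once & reused twice. That is one reason why a statement cache only helps when the same statement really does repeat.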

Typically, organizations have embraced static SQL over dynamic because static is more predictable, showing little or no change, while dynamic implies ever changing & unpredictable. Db2 performance optimization functions have been incorporated into base Db2 (E.g. Buffer Pools) & software products (E.g. IBM Db2 AI for z/OS, IBM Db2 for z/OS Optimizer, IBM Db2 Analytics Accelerator, IBM Z Table Accelerator, IZTA), with varying levels of benefit & cost. Ultimately IBM Z Mainframe customers need simple cost-efficient off-the-shelf solutions of a plug & play variety & without doubt, optimizing static SQL data processing is a pragmatic option for reducing Db2 subsystem CPU usage.

In Db2 Version 8, the key z/Architecture benefit of 64-bit virtual addressing support was finally introduced, increasing capacity of central memory & virtual address spaces from 2 GB to 16 EB (Exabytes), eliminating most storage constraints. In Db2 Version 10, support for 64-bit run time was introduced, providing Virtual Storage Constraint Relief (VSCR), improving the vertical scalability of Db2 subsystems. It therefore follows that any Db2 CPU performance optimization solution should also exploit the z/Architecture 64-bit feature, to support the ever-increasing data storage requirements of today’s digital workloads.
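
The 2 GB & 16 EB figures follow directly from the address widths, 31-bit & 64-bit respectively, when measured in binary units. A quick Python check of the arithmetic (the variable names are just for illustration):

```python
GIB = 2 ** 30  # bytes in one binary gigabyte
EIB = 2 ** 60  # bytes in one binary exabyte

limit_31_bit = 2 ** 31  # addressable bytes with 31-bit addressing (the 2 GB bar)
limit_64_bit = 2 ** 64  # addressable bytes with 64-bit addressing

print(limit_31_bit // GIB)           # 2
print(limit_64_bit // EIB)           # 16
print(limit_64_bit // limit_31_bit)  # 8589934592, i.e. 2**33 times larger
```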

As we have identified, Db2 can consume significant amounts of z/OS CPU accessing & retrieving the same static frequently used data elements repetitively. Upon analysis, these static frequently used data elements are typically found to originate from a small percentage of Db2 tablespaces. At first glance the simple SQL programs involved are considered low risk, but they are repeatedly processed, often in peak processing times, consuming excessive CPU & increasing processing cost accordingly, typically z/OS Monthly Licence Charges (MLC) related. Db2 optimization tools for access path or buffer pool management provide some benefit, but this is not always significant & may require application changes. Patently there is a clear & present requirement for a simple plug & play solution, transparent to Db2 processing, maintaining an optimized high-performance in-memory cache of frequently used Db2 data, safeguarding data integrity in various environments, including SYSPLEX, Data Sharing, et al…

QuickSelect is a plug-in solution dynamically activated in a batch or OLTP environment (I.E. CICS, IMS/TM) intercepting repetitive SQL statements from Db2 application programs, storing the most active result set, not necessarily the entire tablespace, in a high-performance in-memory cache, returning to applications the same result set as per Db2, but much faster & using less CPU accordingly. QuickSelect is completely transparent to z/OS applications, eliminating any requirement to change/recompile/relink application source or rebind packages. QuickSelect processing can be switched on or off using a single keystroke, either defaulting to standard Db2 SQL processing or to benefit from the QuickSelect high-speed cache for optimized CPU resource usage.

The 64-bit QuickSelect server, implemented as a started task, intelligently caches data in self-managed memory above the bar, supporting up to 16 EB of memory, eliminating concerns of using any other commonly used storage areas (E.g. ECSA). The intelligent caching mechanism safeguards that only highly active data is retained, optimizing the associated cache memory size required.

QuickSelect caches frequently requested Db2 SQL result sets, returning these results to the application from QuickSelect cache, when a repetition of the same SQL is encountered. For data integrity purposes, QuickSelect immediately invalidates result sets upon detection of changes to underlying tables, implicitly validating each cache resident SQL result set. Changes to Db2 data by application programs are captured by a standard Db2 VALIDPROC process, attached to the typically small subset of frequently accessed tables of interest to QuickSelect. Db2 automatically activates the VALIDPROC routine whenever the table contents are changed by INSERT, DELETE, UPDATE or TRUNCATE statements, invalidating cached data from the updated tables automatically. For standard Db2 utilities such as LOAD/REPLACE, REORG/DISCARD & RECOVER, table-level changes are identified by a QuickSelect utility-trap, invalidating cached data from the updated tables automatically. QuickSelect also supports SYSPLEX & Data Sharing environments, supporting update activity via the same XCF functions & processes used by Db2.
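
The general technique described here, caching result sets keyed by the SQL text & its input values, then discarding them when an underlying table changes, can be sketched in Python as follows. This is emphatically not the QuickSelect implementation; the ResultSetCache class, the run_query callback & the FX_RATE table are hypothetical names chosen for illustration, & the explicit invalidate call merely stands in for whatever change notification (such as a VALIDPROC routine) the real product uses.

```python
class ResultSetCache:
    """Toy result-set cache with table-level invalidation."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that really executes the SQL
        self._results = {}           # (sql_text, params) -> cached rows
        self._by_table = {}          # table name -> keys that depend on it
        self.hits = 0
        self.misses = 0

    def select(self, sql_text, params, tables):
        key = (sql_text, tuple(params))
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        rows = self._run_query(sql_text, params)
        self._results[key] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(key)
        return rows

    def invalidate(self, table):
        """Drop every cached result set that was built from `table`."""
        for key in self._by_table.pop(table, set()):
            self._results.pop(key, None)


# Hypothetical 'database': one exchange-rate row, plus a log of real executions.
rates = {"GBP": 1.25}
executions = []


def run_query(sql_text, params):
    executions.append(params)
    return [(params[0], rates[params[0]])]


cache = ResultSetCache(run_query)
sql = "SELECT CCY, RATE FROM FX_RATE WHERE CCY = ?"
first = cache.select(sql, ("GBP",), tables=["FX_RATE"])   # executed
second = cache.select(sql, ("GBP",), tables=["FX_RATE"])  # served from cache
rates["GBP"] = 1.30                                       # table is updated...
cache.invalidate("FX_RATE")                               # ...so cached rows are dropped
third = cache.select(sql, ("GBP",), tables=["FX_RATE"])   # executed again
print(first, second, third, cache.hits, cache.misses)
```

The sketch invalidates at table level, which is coarse but simple. Any update to a table discards every result set derived from it, so stale data is never returned, at the cost of some avoidable re-execution.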

QuickSelect delivers the following benefits:

  • CPU Savings: Meaningful reduction (E.g. 20%) in the Db2 SQL direct processing; 10%+ peak time CPU reduction is not uncommon.
  • Faster Processing: Optimized CPU usage delivers shorter batch processing & OLTP transaction response times, for related SLA & KPI objective compliance.
  • Transparent Implementation: No application changes required, whether to source code, load modules or Db2 packages.
  • Survey Mode: Unobtrusive & minimal Db2 workload overhead data sampling to identify potential CPU savings from repetitive SQL & tables of interest, before implementation.
  • Staggered Deployment: Granular criteria (E.g. Job, Program, Table, Transaction, Etc.) implementation ability.
  • Reporting & Analytics: Extensive information detailing cache usage for Db2 programs & tables.

Since 1983 Db2 has evolved dramatically, in line with the evolution of the IBM Z Mainframe server. When considering today’s requirement for a digital world, processing ever increasing amounts of mission critical data, optimizing CPU processing for Db2 SQL data access is a base requirement. In a hybrid support environment where today’s IBM Z Mainframe support resource requires an even blend of technical & business skills, plug & play, easy-to-use & results driven solutions are required to optimize CPU usage, transparent to the subsystem & related application programs. QuickSelect is such a solution, fully exploiting 64-bit z/Architecture for ultimate scalability, identifying & resolving a common CPU consuming data access problem, for a mission critical resource, namely the Db2 subsystem, maintaining mission-critical System of Record data.

z/OS CPU optimization is a mandatory requirement for every organization, to reduce associated software & hardware costs & in theory, as a mandatory prerequisite for deploying the Tailored Fit Pricing for IBM Z pricing mechanism. Tailored Fit Pricing uses the previous 12 months of SCRT submissions to establish a baseline for MSU charging over a contracted period, typically 3 years. If there are any unused MSU resources, these are carried forward to the next year, but if those MSU resources remain unused at the end of the contracted period, they are lost, meaning the organization has paid too much. If MSU usage exceeds the agreed Tailored Fit Pricing baseline, the excess MSU resources are charged at a discounted rate. Achieving an optimal MSU baseline before embarking on a Tailored Fit Pricing contract is arguably mandatory & it therefore follows that optimizing CPU forever more safeguards optimal z/OS MLC charging during the Tailored Fit Pricing contract. QuickSelect for Db2 is a seamless CPU optimization product that will perpetually deliver benefit, helping organizations minimize their z/OS MLC costs, whether they continue to proactively manage the R4HA, submitting monthly SCRT reports, or they embark on a Tailored Fit Pricing contract…
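
The carry-forward & overage mechanics described above can be walked through with a short Python sketch. It is a deliberately simplified model built only from the description in this article, not from IBM contract terms. It assumes unused MSUs accumulate from year to year, & the settle_contract function & all MSU figures are hypothetical.

```python
def settle_contract(baseline_msu, yearly_usage_msu):
    """Walk a multi-year contract, returning (per-year detail, forfeited MSUs)."""
    carry = 0
    years = []
    for used in yearly_usage_msu:
        entitlement = baseline_msu + carry       # this year's baseline plus carry-over
        overage = max(0, used - entitlement)     # charged at the discounted excess rate
        carry = max(0, entitlement - used)       # unused MSUs roll into next year
        years.append({"entitlement": entitlement, "used": used,
                      "overage": overage, "carry_forward": carry})
    return years, carry  # whatever is still unused at contract end is lost


# Hypothetical 3-year contract with a 1,000 MSU annual baseline.
growing, forfeited_growing = settle_contract(1000, [900, 1050, 1200])
flat, forfeited_flat = settle_contract(1000, [900, 900, 900])

for year in growing:
    print(year)
print(forfeited_growing, forfeited_flat)
```

In the flat-usage case the baseline was set 100 MSUs too high every year, so 300 MSUs are paid for but never used. This is why optimizing CPU before the baseline is established matters so much.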

Java: Is System z A Viable Server Platform?

As long ago as 1997, IBM integrated Java into their IBM Mainframe platform, in those days via the then flagship OS/390 Operating System. As with any new technology, perhaps the initial OS/390 Java integration offerings were not perfect, but some ~20 years later, a lot has changed…

In 2000, IBM Java SDK 1.3.1 delivered z/OS and Linux on z support, quickly followed by 64-bit z/OS support in 2003 via SDK 1.4. In 2004 Java Virtual Machine (JVM) and JIT (Just-In-Time) compiler technology support was provided, while Java code has always exploited IBM specialty engines, primarily zAAP initially and now via zIIP and the zAAP on zIIP capability. Put simply, IBM continues to invest aggressively in Java for System z, demonstrating a history of innovation and performance improvements, up to and including the latest z13 server.

So why should a 21st century business consider the System z platform for Java workloads?

Arguably the primary reason is a rapidly emerging requirement for the true 24*7*365 workload, which cannot accommodate a batch window, where Java is ideally placed to serve both batch and OLTP workloads. Put another way, the need to process batch work has not gone away, whereas a requirement to process batch work concurrently with OLTP services has emerged. Of course, traditionally the typical System z enterprise might have two sets of IT staff for OLTP and batch workloads, typically in the IT Support and Application Management teams, whereas via Java and a workload centric approach, separate batch and OLTP support personnel are not necessarily required.

For the System z platform, Java support has always been incorporated into the core architectural building blocks, namely z/OS, CICS, DB2, IMS, WebSphere, Batch Runtime, et al. Therefore there are no functional reasons why new applications or indeed existing applications cannot be engineered using the pervasive Java programming language and deployed on the System z platform.

Quite simply, Java is a critically important language for IBM System z. Java has become foundational for data serving and transaction serving, the traditional strengths of IBM System z. WebSphere applications written in Java and processing via System z, benefit from a key advantage through co-location. This delivers better response times, greater throughput and reduced system complexity when driving CICS, DB2 and IMS transactions.

Java is also critical for enabling next generation workloads in the IBM defined Cloud, Analytics, Mobile & Security (CAMS) framework. Cloud and mobile applications can access z/OS data and transactions via z/OS Connect and other WebSphere solutions, all inherently Java based. Java on System z also provides a full set of cryptographic functions to implement secure solutions. A key strength of Java applications is the ability to immediately benefit from the latest hardware performance improvements using the Just-In-Time (JIT) compiler incorporated in the latest IBM Java SDK releases.

Let’s not forget, there are many other good reasons why Java might be considered as a viable application programming language:

  • Personnel Skills Availability: Java is typically ranked in the top 3 of most widely used programming languages; therefore personnel availability is abundant and cost efficient.
  • Application Code Portability: Recognizing Java bytecode and associated JVM functionality, no matter what the platform (E.g. Wintel, X86 Linux, UNIX, z/OS, Linux on System z, et al), the Java application code should process without consideration.
  • Application Tooling Support: Application Development tools have evolved to the point of true platform independence, where Application Programmers just create their code, they don’t necessarily know or sometimes care, where that code will execute. Let’s not forget the simplification of Java code for OLTP and batch workloads, reducing associated IT lifecycle support costs.
  • TCO Efficiencies: Simplified Application Development and deployment reduces associated cost, while reducing implementation time for mission-critical workloads. Java exploitation of the zAAP (zAAP on zIIP) safeguards low software costs and optimized processing times (I.E. Sub-Capacity specialty engines run at full speed).

With the announcement of the zEC12 server, notable Java enhancements included:

  • Hardware Transactional Memory (HTM) – Better concurrency for multi-threaded applications
  • Run-Time Instrumentation (RI) – Innovation of a new hardware facility designed for managed runtimes
  • 2 GB Page Frames – Improved performance targeting 64-bit heaps
  • Pageable 1 MB Large Pages (Flash Express) – Better versatility of managing memory
  • New Software Hints/Directives – Data usage intent improves cache management; Branch pre-load improves branch prediction
  • New Trap Instructions – Reduce implicit bounds/null checks overhead

In summary, System z users can expect up to 60% throughput performance improvement amongst Java workloads measured with zEC12 and the IBM Java 7 SR3 SDK.

IBM z13 and the IBM Java 8 SDK deliver improved Java performance by exploiting the Single Instruction Multiple Data (SIMD) vector engine, Simultaneous Multi-Threading (SMT) and improved CP Assist for Cryptographic Function (CPACF). This delivers up to 2X improvement in throughput-per-core for security-enabled applications and up to 50% improvement for other generic applications.

Other z13 Java functional and performance improvements include:

  • Secure Application Serving – Application serving with Secure Socket Layers (SSL) will exploit the new Java 8 Clear Key CPACF and SIMD vector instructions for string manipulation. An additional 75% performance improvement for Java 8 on z13 with SMT versus Java 8 on zEC12.
  • Business Rules Processing – Business rules processing with Java 8 takes advantage of the SIMD vector instructions and SMT for zIIP specialty engines on z13 to achieve significant improvements in throughput-per-core. An additional 37% performance improvement from z13 SMT zIIPs with Java 8 versus Java 8 on zEC12.
  • Specific z/OS Java 8 Exploitation of z13 SIMD – Java 8 exploits the new z13 SIMD vector hardware instructions for Java libraries and functions. These SIMD vector hardware instructions on z13 deliver improved performance, where specific idioms/operations were improved by between 2X and 60X. Performance benefits for real life Java applications will be dependent on how frequently these idioms/operations are used.
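
One common way to reason about how an idiom-level improvement translates into an application-level improvement is Amdahl's law, which is brought in here for illustration and is not part of the IBM measurements. The Python sketch below assumes a hypothetical application that spends a given fraction of its run time in the accelerated idioms; the function name and sample fractions are made up.

```python
def overall_speedup(fraction_accelerated, idiom_speedup):
    """Amdahl's law: whole-application speedup when only part of it gets faster."""
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if idiom_speedup <= 0:
        raise ValueError("idiom_speedup must be positive")
    unchanged = 1.0 - fraction_accelerated
    return 1.0 / (unchanged + fraction_accelerated / idiom_speedup)


# Hypothetical applications spending 10% and 50% of their time in SIMD-friendly idioms.
for fraction in (0.10, 0.50):
    for speedup in (2, 60):
        print(fraction, speedup, round(overall_speedup(fraction, speedup), 2))
```

Under these assumptions, even a 60X idiom improvement yields only about a 1.1X overall gain when the idiom accounts for 10% of run time, which is why the benefit depends so heavily on how often the idioms are used.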

In conclusion, the IBM commitment to Java on System z is clearly evident and the cost, performance and security proposition becomes compelling on the latest zEC12 and z13 Mainframe servers. The pervasive deployment of Java as a universal IT programming language dictates that programmer availability will never be an issue, and platform independence dictates that Java applications can be created and processed on any platform. Let’s not forget, the strong single thread performance and I/O scalability of System z as a significant differentiator when comparing Java performance on any IT platform.

Moreover, as always, perhaps the business dictates what platform is the most suitable for business applications. The evolution to a combined OLTP and batch workload for the 21st Century 24*7*365 mission critical business application, ideally places Java as an eminently viable programming language. Therefore there is no requirement to reengineer any existing System z application, or to find an alternative platform for new business functions. As always, the System z Mainframe platform should never be overlooked…

COBOL – A Viable Programming Language?

For the last twenty years or so I have encountered many scenarios where Mainframe users consider migration to a Distributed Systems (E.g. Wintel, UNIX/Linux, et al) platform, where more often than not the primary reasons seem to be “green screen” and/or “COBOL is a declining legacy language” based.

Going back to basics, COBOL is a Common Business Oriented Language, although the naysayers might say COBOL is a Completely Obsolete Business Oriented Language; we will perhaps try to be more dispassionate in this discussion…

Industry Analysts have stated that there are ~220 Billion lines of COBOL code and ~100,000 programmers and that COBOL applications process ~80% of business transactions daily, and that there are ~200 times more COBOL transactions processed daily, when compared with Google searches!  A lot of numbers and statistics, but seemingly COBOL is still widely used and accepted.  Even from a new development viewpoint, ~5 Billion lines of COBOL code per annum (~15% of Annual Global Development) is stated, suggesting that COBOL is not in any way obsolete or legacy, so why is COBOL perceived by some in a dubious manner?

Maybe because COBOL was introduced in 1959 and primarily it is deployed on the Mainframe, and so anything that is 50+ years old and has an association with the Mainframe just has to be dubious, doesn’t it?  Of course not, as this arguably “pioneering” or at least one of the first “widely deployed” programming languages allowed many global and significant businesses to grow, in tandem with the IBM Mainframe platform, automating and streamlining business processes, increasing productivity and so on.  So depending on your viewpoint, COBOL was either in the right place at the right time, stimulating the Data Processing (DP) and Information Technology (IT) revolution, or COBOL just got lucky, it was “Hobson’s Choice”…

Although there have been several iterations of COBOL standards (I.E. COBOL-68, COBOL-74, COBOL-85), primarily associated with the American National Standards Institute (ANSI) and more latterly COBOL 2002 (ISO), a COBOL program that was written and compiled on an IBM Mainframe several decades ago, will most likely still run on the latest generation IBM Mainframe.  Put another way, its backwards compatibility has been significant, and although there were some migration considerations associated with the Language Environment (LE), the original COBOL Application Development investment has generated a readily usable Return On Investment (ROI) over and over again.  How true is this for other programming languages and computing platforms?  For the avoidance of doubt, a COBOL program that was written in 24-bit, can still run today on a 64-bit platform, and with a modicum of evolution, fully exploit the latest functionality and 64-bit performance, with minimal fuss.  By contrast, how many revolutionary or significant upgrades have been required for Commercial Off The Shelf (COTS) software and associated bespoke application development code, to upgrade non-Mainframe platforms from 16-32-64-bit?

So, is COBOL a viable programming language of the future?  One must draw one’s own conclusions, but we can look to recent functional enhancements and statements of direction from an IBM Mainframe viewpoint.

In recent years IBM have actually increased the number of COBOL R&D personnel by ~100%, while increasing allocated investment, commitment and interest accordingly.  This observation more than any other, suggests that at least from an IBM Mainframe viewpoint, COBOL is an important function.

From a technical function viewpoint, the realm of possibility exists with COBOL, interacting with all 21st century programming and function techniques, dismissing the notion that COBOL can only be considered as a traditional/legacy option for CICS-Batch applications and associated “green screen” environments, for example:

  • Support for CICS integrated translator
  • Support for latest SQL data types in syntax via DB2
  • Support for Java interoperability via object-oriented COBOL syntax
  • Support for access to WebSphere enterprise beans
  • Support for Java SDK
  • Support for XML high speed parsing and validation (UTF-8, UTF-16 & various EBCDIC codepages)

From a strategic statement of direction viewpoint, IBM have declared the following major notable activities:

  • Performance and resource utilization optimization, reducing TCO accordingly
  • Improved middleware (I.E. CICS, DB2, IMS, WebSphere) programmability and problem determination
  • Improved capabilities (E.g. XML, Java, et al) for modernizing & creating business critical applications
  • Improved programmer (E.g. Usability and Problem Determination) productivity
  • Source and load (I.E. recompile not required) compatibility (E.g. old programs can call new and vice versa)

Even for those occasions where the IBM Mainframe platform might be decommissioned, COBOL can still be processed on alternative platforms via code migration techniques such as Micro Focus, where such functions and services can be Cloud based.  However, once again, isn’t the IBM Mainframe the ultimate “Cloud” platform, which has arguably been the case “forever thus”?

One must draw one’s own conclusions as to why the Mainframe platform and/or COBOL applications are often considered for replacement via migration, when the Mainframe platform is both strategic and cost efficient.  As with any technology decision, there is no “one size fits all” solution, but perhaps a little education can go a long way, and at least the acceptance that seemingly “legacy” technologies are strategic and viable.