System z Batch Optimization: Another Pipes Option?

Over the last 20 years or so I have encountered many sites looking to streamline their batch processing, only to find that they are sometimes their own worst enemy: a cautious Change Management approach means they will not change, or even recompile, COBOL application source unless absolutely forced to do so.  Sometimes VSAM file tuning is the answer, sometimes it is identifying the batch critical path, and on occasion it is finding the key file or database that is processed several times and might benefit from parallelism.

BatchPipes was first introduced with MVS/ESA, allowing data (e.g. BSAM, QSAM) to be piped between several jobs so that they can run concurrently, reducing the combined elapsed time of the associated job stream.  BatchPipes maintains a queue of records that are passed between a writer and a reader.  The writer adds records to the back of the pipe queue and the reader processes them from the front.  This record-level processing approach avoids the data set serialization issues that arise when attempting to write and read records concurrently from the same physical data set.
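
The writer and reader queue behaviour described above can be illustrated with a small Python sketch.  This is only an analogy using threads and an in-memory queue, not BatchPipes itself; the names writer, reader, run_pipe and the SENTINEL end-of-data marker are hypothetical and chosen purely for illustration.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-data marker, not a BatchPipes concept


def writer(pipe, records):
    """Add each record to the back of the pipe, then signal end of data."""
    for record in records:
        pipe.put(record)  # blocks while the pipe is full
    pipe.put(SENTINEL)


def reader(pipe, out):
    """Take records from the front of the pipe until end of data."""
    while True:
        record = pipe.get()  # blocks while the pipe is empty
        if record is SENTINEL:
            break
        out.append(record.upper())  # stand-in for real record processing


def run_pipe(records, depth=4):
    """Run writer and reader concurrently over a bounded queue."""
    pipe = queue.Queue(maxsize=depth)
    out = []
    w = threading.Thread(target=writer, args=(pipe, records))
    r = threading.Thread(target=reader, args=(pipe, out))
    w.start()
    r.start()
    w.join()
    r.join()
    return out
```

Because the queue is bounded, the writer waits whenever the reader falls behind, so neither side needs the complete data set to exist before the other can start.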

The IBM BatchPipes feature has evolved somewhat, and BMC have offered similar functionality, initially with their Data Accelerator and Batch Accelerator offering, subsequently superseded by MainView Batch Optimizer Job Optimizer Pipes.  To derive the parallelism benefit offered by BatchPipes, the reader and writer jobs clearly need to be processed together.  For many sites, this scheduling requirement has been enough to eliminate any notion of BatchPipes implementation.  Another consideration is a job failure in the BatchPipes process, where restart and recovery might involve several jobs, as opposed to one.  As a result, widespread usage of BatchPipes appears to have been limited.

The first step in any BatchPipes consideration is identifying whether there is any benefit.  IBM provide a BatchPipes SMF analysis tool that estimates the time savings and benefits that can be achieved with BatchPipes.  This tool reads SMF record types 14, 15 and 30 (subtypes 1, 4 and 5) to analyse data set read and write activity, reconciling it with the associated processing jobs.  As an observation, the same data source sometimes appears under different data set names, both permanent and temporary, while consuming significant I/O and CPU resource for processing.  Such data sources can easily be reconciled, because when the entire data set is processed, the record count and associated I/O count are the same each time it is written or read.  The analysis tool will identify the heavy I/O jobs and is a great starting point for any analysis activity.
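
The record count reconciliation observed above can be sketched as a toy heuristic in Python.  This is not the logic of the IBM analysis tool; the Activity tuple, its field names and the pipe_candidates function are hypothetical, and the input is assumed to be per-job data set activity already extracted from SMF data by some other means.

```python
from collections import namedtuple

# Hypothetical summary row: one job's activity against one data set.
# mode is "W" for write or "R" for read.
Activity = namedtuple("Activity", "job dsname mode records")


def pipe_candidates(activities, min_records=1):
    """Pair writes and reads from different jobs with equal record counts.

    Equal counts only suggest the same data source; any match still
    needs to be confirmed by someone who knows the job stream.
    """
    writes = [a for a in activities
              if a.mode == "W" and a.records >= min_records]
    reads = [a for a in activities
             if a.mode == "R" and a.records >= min_records]
    pairs = []
    for w in writes:
        for r in reads:
            if w.job != r.job and w.records == r.records:
                pairs.append((w.job, r.job, w.records))
    # Largest data sources first, as they offer the most potential benefit.
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

Matching on record count alone can produce false positives, which is why the sketch treats its output as a list of candidates to investigate rather than a final answer.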

UNIX users will be very familiar with the concept of pipes: a UNIX pipeline is a sequence of processes chained together by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one.  Wouldn’t it be good if there were a hybrid approach to BatchPipes, using a combination of standard z/OS and z/OS UNIX System Services (USS)?
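
For readers less familiar with UNIX, the stdout to stdin chaining can be sketched in Python with two child processes.  This is a generic illustration rather than anything z/OS specific; the run_pipeline function and the two tiny inline programs are hypothetical.

```python
import subprocess
import sys


def run_pipeline():
    """Chain two processes so the first one's stdout feeds the second's stdin."""
    produce = "for w in ('alpha', 'beta', 'gamma'): print(w)"
    consume = "import sys\nfor line in sys.stdin: print(line.strip().upper())"

    p1 = subprocess.Popen([sys.executable, "-c", produce],
                          stdout=subprocess.PIPE)
    p2 = subprocess.Popen([sys.executable, "-c", consume],
                          stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
    p1.stdout.close()  # the parent no longer needs its copy of the pipe
    output, _ = p2.communicate()
    p1.wait()
    return output.split()
```

Both processes run at the same time, and the second starts consuming records as soon as the first produces them, which is the behaviour a piped batch stream is trying to reproduce.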

With z/OS 2.2, JES2 introduced new functions to facilitate the scheduling of dependent batch jobs.  These functions comprise Job Execution Control (JEC) and can be utilized by making use of the new JOBGROUP and related Job Control Language (JCL) statements.  The primary goal of JEC is to provide an easy-to-use control mechanism, allowing complex batch jobs to be processed in inter-related constituent pieces.  Presuming that these constituent pieces can be run in parallel, improved throughput can be achieved by exploiting the concurrency functions provided by JEC.

UNIX named pipes can be used to pass data between simultaneously executing jobs, and the pipe can be either temporary or permanent.  One or more processes can connect to a UNIX named pipe, write to it, and read from it, as and when required.  Unlike most types of z/OS UNIX files, data written to a named pipe is always appended to existing data rather than replacing it.  In z/OS FTP, for example, this is why the STOR command is equivalent to the APPE command when UNIXFILETYPE=FIFO is configured.  This UNIX pipe facility, managed by the JES2 JEC functions, can be leveraged to benefit both multiple-step job processing and concurrent job processing, reducing overall batch stream elapsed time.
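
The named pipe behaviour described above can be sketched in Python on any POSIX system.  This is a minimal illustration, assuming two threads standing in for the concurrently executing writer and reader jobs; the function names and the batch.pipe file name are hypothetical, and on z/OS the pipe would normally be referenced from JCL rather than from a script like this.

```python
import os
import tempfile
import threading


def writer_job(fifo_path, records):
    """Stand-in for the writer job: open blocks until a reader connects."""
    with open(fifo_path, "w") as fifo:
        for record in records:
            fifo.write(record + "\n")


def reader_job(fifo_path, out):
    """Stand-in for the reader job: reads until the writer closes the pipe."""
    with open(fifo_path) as fifo:
        for line in fifo:
            out.append(line.rstrip("\n"))


def run_named_pipe(records):
    """Create a temporary named pipe and pass records through it."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "batch.pipe")  # hypothetical name
    os.mkfifo(fifo_path)
    out = []
    try:
        r = threading.Thread(target=reader_job, args=(fifo_path, out))
        w = threading.Thread(target=writer_job, args=(fifo_path, records))
        r.start()
        w.start()
        w.join()
        r.join()
    finally:
        os.remove(fifo_path)  # a temporary pipe is removed after use
        os.rmdir(workdir)
    return out
```

Neither side can make progress until the other has opened the pipe, which is why the writer and reader must be scheduled to run together.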

In conclusion, the advancement in JES2 JEC processing simplifies batch scheduling and restart configuration, while the usage of UNIX named pipes leverages existing z/OS USS functionality, safeguarding good performance with a tried and tested process.

Finally, returning full circle to my initial observation on Change Management considerations when performing batch optimization initiatives: recently I worked with a customer I visited in 2001, when they considered and dismissed BatchPipes Version 2.  We piloted this new UNIX pipe facility in Q4 2016, in readiness for their year-end processing, where they finally delivered a much needed ~2 hour reduction in their ~9 hour critical path year-end batch process.  Sometimes patience is a virtue, assisted by a slight implementation tweak…