IBM Z Mainframe VTL End Of Support (EOS): A Problem Or Opportunity?

For ~20 years, since 1996 when IBM announced their IBM TotalStorage Virtual Tape Server Model B16 (3494-B16), typically known as the VTS, followed by the StorageTek (Oracle) Virtual Storage Manager (VSM) in 1998, there has been evident IBM Mainframe VTL innovation and product line refreshes, offering a granularity of offerings for all users, regardless of size.  The consolidation of the IBM Mainframe VTL marketplace in the ~2017-2019 period is notable.  IBM have consolidated their options to the high-end TS7760, retiring their TS7720 and TS7740 models. Similarly, Oracle have also delivered significant performance and enhancements to their VSM offering, where the latest VSM 7 delivers significant resource when compared with the VSM 6 and older predecessors (NB. The VSM 6 platform replaced the proprietary VSM 5 platform with Sun servers & Sun JBOD disk storage).  Similarly, EMC have consolidated their DLm offerings to the DLm8500, retiring their DLm1000, DLm1020, DLm2000, DLm2100, DLm6000 and DLm8000 models.

A high-level review of the mainstream market place offerings, namely EMC DLm8500, IBM TS7760 and Oracle VSM 7 demonstrates Enterprise Class VTL solutions, delivering significant availability, capacity and performance capabilities, mandatory for the higher echelons of IBM Z Mainframe user.  Conversely, it follows that such attributes and associated cost become somewhat of a concern for the small to medium sized IBM Mainframe user.  When any product becomes End Of Support (EOS), End Of Life (EOL) or even End Of Marketing (EOM), the viability and associated TCO becomes a consideration.  Typically, there are several options to address such an issue:

  • Do nothing (because we’re decommissioning the IBM Mainframe sometime soon)
  • Secure a long-term support contract (E-g. 3-5 years) ASAP, to reduce increasing support costs
  • Perform a technology refresh to the latest supported supplier offering
  • Review the marketplace and migrate to a more suitable supported solution

Only the incumbent IBM Mainframe VTL user can decide the best course action for their organization, but from a dispassionate viewpoint, reviewing these respective options generates the following observations:

  • Do nothing: The cost of doing nothing is always expensive. The perpetual “we’re moving away from the IBM Mainframe in the next 3-5 years” might have been on many “to-do” lists, for decades”!  The IBM Mainframe platform is strategic!
  • Long-term support contract: This delays the inevitable and potentially generates data availability challenges, as the equipment ages and potentially becomes more unreliable, with limited or expensive OEM support.
  • Technology refresh: In theory, the best option, upgrading the incumbent technology to the latest offering. In this instance, the cost might be significant for the small to medium sized user, as EMC, IBM and Oracle no longer offer “entry to medium-sized” solutions.
  • Migrate: By definition migration is perceived as introducing risk, migrating from a tried and tested to a new solution. However, generally the best products come from suppliers with a focus on their flagship solution, as opposed to a large company, with many offerings…

The IBM Mainframe VTL marketplace does include other suppliers, including FUJITSU, LUMINEX, Visara, naming but a few, and one must draw one’s own conclusions as to their respective merits.  What is always good is a new marketplace entrant, with a credible offering, a different approach or demonstrable expertise.

Optica Technologies is a privately held technology company headquartered in Louisville, Colorado, USA. Optica have been providing high-quality data centre infrastructure solutions since 1967. Optica has been an IBM strategic partner since 2002 and has received the most extensive IBM qualification available for third party solutions. Optica products have been successfully deployed in many major enterprise data centres worldwide.

The Optica Prizm FICON to ESCON Protocol Converter designed to enable IBM mainframe customers to invest in the latest System Z platforms (I.E. zEC12/zBC12 upwards), while preserving the ability to connect to critical ESCON and Bus/Tag device types that remain.

The next generation zVT Virtual Tape Node (VTN) exploits the latest Intel server technology, delivering outstanding performance, resiliency and scalability to serve a broad range of IBM Z customers. Each zVT VTN is modular and packaged efficiently with (2) FICON channels in an industry standard 2U rack format. The zVT VTN supports up to 512 3490/3590 Virtual Tape Drive (VTD) resources, delivering ~500 MB/S performance for the typical IBM Mainframe tape workload. As per some of the architectural design characteristics of the IBM Z Mainframe server (I.E. z13, z14), the zVT VTN server is enabled for operation in warmer environments than traditional data centres and engineered for extreme conditions such as high humidity, earthquakes and dust. To support the diversity of IBM Z Mainframe customer environments, from the smallest to largest, the flexible zVT solution is available in three different formats:

  • zVT 3000i: for IBM Mainframe users with more limited requirements, the fully integrated zVT 3000i model leverages the same Enterprise Class zVT VTN, incorporating 16 Virtual Tape Drive (VTD) resources and 8 TB of RAID-6 disk capacity, delivering 20 TB of effective capacity via the onboard hardware compression card (2.5:1 compression). The fundamental cost attributes of the zVT 3000i make a very compelling argument for those customers on a strict budget, requiring an Enterprise Class IBM Mainframe storage solution.
  • zVT 5000-iNAS: the flagship zVT 5000-iNAS solution is available in a fully redundant, high availability (HA) base configuration that combines (2) VTNs and (2) Intelligent Storage Nodes (ISNs). The entry-level zVT 5000-iNAS HA offering incorporates 512 (256 per VTN) Virtual Tape Drive (VTD) resources, delivering ~1 GB/Sec performance, 144 TB RAW and ~288 TB of effective capacity using a conservative 4:1 data reduction metric. zVT 5000-iNAS can scale to a performance rating of ~4 GB/Sec and capacity in excess of 11 PB RAW.
  • zVT 5000-FLEX: For IBM Mainframe users wishing to leverage their investments in IP (NFS) or FC (SAN) disk arrays, the zVT 5000-FLEX offering can be configured with (2) 10 GbE (1 GbE option) or (2) 8 Gbps Fibre Channel ports. Virtual Tape Drive (VTD) flexibility is provided with VTD options of 16, 64 or 256, while onboard hardware compression safeguards optimized data reduction.  Enterprise wide DR is simplified, as incumbent Time Zero (E.g. Flashcopy, Snapshot, et al) functions can be utilized for IBM Mainframe tape data.

In summary, Optica zVT reduces the IBM Mainframe VTL technology migration risk, when considering the following observations:

  • Technical Support: With 50+ years IBM Mainframe I/O connectivity experience, Optica have refined their diagnostics collection and processing activities, safeguarding rapid problem escalation and rectification, with Level 1-3 experts, located in the same geographical location.
  • Total Cost of Acquisition (TCA): zVT is a granular, modular and scalable solution, with a predictable, optimized and granular cost metric, for the smallest to largest of IBM Mainframe user, regardless of IBM Z Operating System.
  • Total Cost of Ownership (TCO): Leveraging from the latest software and hardware technologies and their own streamlined support processes, Optica deliver world class cradle-to-grave support for an optimized on-going cost.
  • Flexibility: Choose from an all-in-one solution for the smallest of users (I.E. zVT 3000i), a turnkey high-availability solution for simplified optimized usage (I.E. zVT 5000-iNAS) and the ability to leverage from in-house disk storage resources (I.E. zVT 5000-FLEX).
  • Simplified Migration: A structured approach to data migration, simplifying the transition from the incumbent VTL solution to zVT. zVT also utilizes the standard AWSTAPE file format, meaning data migration from zVT is simple, unlike the proprietary AWS file formats used by other VTL offerings.

In conclusion sometimes End Of Support (EOS) presents an opportunity to review the incumbent solution and consider a viable alternative and in the case of an IBM Mainframe VTL, for the small to medium sized user especially, having a viable target option, might just allow an organization to maintain, if not improve their current IBM Mainframe VTL expenditure profile…

Mainframe Virtual Tape: Tape On Disk; But For How Long?

By definition, a Virtual Tape Library (VTL) solution uses a disk cache to store tape data files, but for how long is this data retained on disk? Is it minutes, hours, days, weeks or indefinitely? Only business requirements can dictate the time period tape data is stored on disk, which will influence the VTL solution chosen. We will return to this pivotal question later in the article…

Some might say (for some reason I’m thinking of an Oasis lyric) that Mainframe Virtual Tape choice is as simple as black and white; or blue (IBM) and red (Oracle AKA StorageTek). Hmmm, clearly this is not the case; there are grey areas, but moreover, there are many colours to choose from. For sure we must recognize the innovation in tape technologies by StorageTek, delivering the 1st Automated Tape Library (ATL, NearLine) and IBM with the first Virtual Tape Library (VTL, VTS), naming but a few. Of course, now I recall, IBM delivered VTS in the mid-1990’s, about the same time as that Oasis song!

There is also that age old debate as to whether tape is dead or not and the best compromise always seems to be, “we’ll have to agree to disagree”, depending upon your viewpoint. Does it matter?

I also recall the early 1990’s, where Mainframe disk was proprietary and based upon 1:1 mapping, a physical disk was the addressable DASD volume. The promise of Iceberg (AKA SVA) from StorageTek and the delivery of Symmetrix by EMC changed this status quo, and so the Mainframe world adopted logical to physical mapping for disk storage, via RAID technologies, with Just a Bunch Of Disks (JBOD). This was significant, as the acquisition cost per MB for Mainframe disk was ~£5 (yes that’s right, I’m a Brit, so GBP), and today, maybe ~£0.01 (1 Penny) per MB, or ~£10 per GB, and getting lower each year. So yes, tape is always less expensive when compared with disk, by significant magnitudes, but the affordability of disk indicates that it can now be seriously considered, for backup and archive data.

As with any technology decision, it should be business requirements that drive the solution chosen, and not an allegiance to a storage media type, tape or disk, or a long time Mainframe tape vendor, IBM or Oracle. Ultimately there is only one thing that differentiates one business from another, and that is the data itself, stored in whatever format, databases, application code libraries, batch flat files, et al. Therefore the cost of storage is somewhat arbitrary; it’s the value of the business data that we should consider, while recognizing capital expenditure and TCO running costs.

The 21st century business seemingly requires near 24*7 service availability and if that business deploys a zSeries (~zero downtime) Mainframe server, I guess we can presume that said business requires near 24*7 data availability. We then must consider Business Continuity and associated Disaster Recovery metrics, which are measured by the Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Ultimately these RTO and RPO values will dictate the required Backup & Recovery and Archive solutions required, where Recovery (time) is the most important factor!

When was the last time you performed a completely successful Disaster Recovery test from a secondary (physical tape, virtual tape disk) copy of data and was the Recovery Time Objective (RTO) satisfied? Was this a complete workload test, where you included on-line, batch and backup (VTL) testing?

From a data categorization viewpoint, industry analysts tell us, if we didn’t know this fact ourselves, that the majority of Mission Critical data is stored in database structures. If we associate other data types with said databases, application code to process the data, policies to manage and safeguard the data and processes to secure and preserve the data, then I guess we have many instances of Mission Critical data.

As the cost of disk has reduced, so has the cost of network bandwidth, so it’s not uncommon for Mainframe customers to mirror/replicate their data between Geographically Dispersed (E.g. GDPS, GDDR) data centres. They deploy this significant investment solution because they have a requirement for near 24*7 service and thus data availability. Therefore their RTO is likely measured in Minutes (E.g. ~5-15), not because the underlying technology can’t deliver a near instantaneous switch, but because the data needs a Point of Consistency (PoC), and this is the “latency time” for delivering a meaningful RPO (E.g. Pre Batch, Post Batch). Mission Critical databases need to establish a Quiesce PoC, to safeguard data consistency.

If the Mainframe user implements this high availability solution for their primary data copy, why wouldn’t they do this for their secondary (E.g. Backup, Archive) data copy? Ultimately there is generally a hierarchy of RTO and RPO objectives, associated with physical and logical failures. A mirrored disk environment only provides rapid recovery (RTO) for a physical component failure, while a logical data failure will manifest itself for all data copies in the mirror topology. Therefore we always have to consider what is our last line of defence for data recovery; typically a secondary backup data copy. Clearly recovering data from a backup, even a disk based backup, generates a significantly higher recovery (RTO) elapsed time. We might also consider data consistency for this backup data copy; namely, has the backup data been completely destaged/written to the target storage device, tape or disk? Of course, if we don’t have a good backup, we can’t recover the data!

OK, we have come full circle to that original question, by definition, a Virtual Tape Library (VTL) solution uses a disk cache to store tape data files, but how long is this data retained on disk? Is it minutes, hours, days, weeks or indefinitely? Only business requirements can dictate the time period tape data is stored on disk, which will influence the VTL solution chosen.

VTL solutions can be classified as either traditional or tapeless. Traditional is a combination of physical drives and cartridge media in an ATL with a Virtual Tape disk cache (usually proprietary) that is destaged periodically to physical cartridge media, where the primary suppliers are of course IBM with their TS7700 family and Oracle with their VSM offering, while Fujitsu have their CentricStor offering. Tapeless VTL solutions are typically FICON/ESCON channel attached appliances to a back-end disk cache (typically IP, FC or iSCSI), where the tape data is permanently stored on disk. Because the back-end disk cache can be any disk subsystem, within reason, the disk acquisition cost is optimized, because it’s classified as Enterprise/Distributed disk, as opposed to Mainframe disk.

There are many suppliers of tapeless VTL solutions, but the primary vendors are EMC with their Disk Library for Mainframe (DLm) offering and HDS with a several layered approach including LUMINEX Gateways and HDS disk. EMC recently acquired Bus-Tech, where DLm is an OEM of the Bus-Tech MDL solution, still available via the EMC Select option. IBM, Oracle and Fujitsu also offer tapeless VTL solutions, as and if required, but generally they’re deployed in combination with their traditional physical tape based VTL/ATL offerings. There are also software options, IBM Virtual Tape Facility for Mainframe (VTFM) and CA Vtape, where these software solutions deploy higher cost Mainframe disk as the virtual tape cache.

The majority of VTL solutions benefit from data dedupe functionality, where IBM incorporates their ProtecTIER technology, EMC and HDS incorporate DataDomain technology, while Oracle does not currently support Mainframe dedupe, incorporating a Virtual Library Extension (VLE) as a second tier of VTL disk storage. Ultimately dedupe delivers significant (~10-20:1) data reduction benefits and arguably is mandatory for any large scale Mainframe VTL implementation.

Each and every business must draw their own conclusions for VTL implementations and whether they should be tapeless or not. Most Mainframe users have experienced the benefits of mirrored disk (I.E. IBM PPRC, EMC SRDF, HDS TrueCopy, XRC, et al) and have implemented high-availability solutions with a short-term RTO for physical failures. However, only that business can consider how robust their data recovery processes are for logical data failures, and in the worst case scenario, restoring an entire Mission Critical application from a backup copy. The driving factor for this type of recovery is RTO and where is that “last chance” backup data copy stored, tape or disk storage media, and local, remote or 3rd party data centre?

Just as the business must establish a 1st level RPO and associated RTO for their Mission Critical database structures, typically via a quiesce Point of Consistency (PoC), they must do the same for their 2nd level backup data. If a VTL destages data from disk cache to physical tape, then the time required to create the final physical tape copy will influence the associated RTO, and potentially how much data loss might occur. For the avoidance of doubt, if backup data cannot be detstaged to physical tape, then the backup has not been completed, and is unusable. Ultimately data loss is not acceptable, whether a database, or a backup copy. So what steps can the Mainframe user take to minimize this risk?

Because tapeless VTL solutions can attach to any disk subsystem, within reason, IT departments generally have their preferred disk supplier and associated processes. Data dedupe significantly reduces disk acquisition cost and associated network transmission costs, while the functional abilities of disk subsystems are typically higher (I.E. Mirroring, Replication) and more robust when compared with tape subsystems.

If the typical Mainframe user has confidence in their disk mirroring solution for physical failure scenarios, generally associated with the primary copy of Mission Critical data, it seems a logical conclusion that they could extend this modus operandi to secondary (E.g. Backup) copies, eradicating if not eliminating any data loss concerns.

If the Mainframe user deploys EMC Symmetrix (VMAX) for disk data, they could deploy the DLm 8000 VTL to benefit from SRDF/GDDR functionality; if they deploy HDS USP, they could deploy LUMINEX gateways to benefit from TrueCopy functionality, and so on. There are many options available, when the front-end host connectivity (E.g. FICON, virtual tape drives) is separated from the back-end data store (E.g. IP/FC/iSCSI disk).

Additionally, the smaller Mainframe user that cannot afford hot/warm site recovery facilities can also consider different options for Disaster Recovery solutions. For example, they could deploy a tapeless VTL in their only data centre, benefitting from data dedupe for data reduction, transmitting their backup/archive data via IP (or other network transmission) into a 3rd party suppliers facilities, duplicating the VTL and disk subsystems to store the data. They can then modify their Disaster Recovery (DR) procedures to invoke DR as and when required, at that point connecting the 3rd party Mainframe resources to the VTL and data recovery can start immediately. Therefore the traditional off-site DR test at 3rd party provider premises increases in efficiency, while backup data availability is not reliant on the Ford Transit Access Method (FTAM)!

So, how long should secondary copies of Mission Critical data be retained on Virtual Tape disk? Is it minutes, hours, days, weeks or indefinitely? The jury might still be out, but to deliver near 24*7 data availability, for both logical and physical failure scenarios, seemingly at least one secondary copy of Mission Critical data should be retained indefinitely on Virtual Tape disk…