Raising the Bar on IBM Z Resiliency With GDPS
GDPS solutions provide the capability for capture and use of multiple point-in-time copies of data to protect clients' systems of record from logical data corruption.
By David Clitherow, 07/12/2019
With the recent push toward digital enablement and always-on infrastructure, organizations are striving to ensure their IT infrastructure can deliver service irrespective of any outages or events that befall them.
The IBM Z* platform provides industry-leading availability characteristics based on the inherent strengths of the overall platform. Technologies such as Parallel Sysplex data sharing and data replication are fundamental to providing this availability, and they can be further enhanced by deploying automation and orchestration solutions such as GDPS.
GDPS is a family of solutions designed around specific requirements of different IBM Z implementation patterns or topologies. GDPS solutions take into account the nature of the data replication in place, including the number of copies, and other availability-related requirements that the operating environment dictates.
GDPS solutions also provide the capability to capture and use multiple point-in-time copies of data to protect clients’ systems of record from logical data corruption, whether caused by cyberattacks or by internal malicious damage. The solutions fall into the following groups based on the underlying replication technology being used:
GDPS Metro Solutions
These solutions address the requirements of clients using the synchronous replication technology known as Metro Mirror. GDPS Metro is tightly integrated with the Parallel Sysplex clustering technology that provides near-continuous availability characteristics for the z/OS* environment. GDPS Metro provides not only replication management, but also systems management and sysplex management functions for the Parallel Sysplex, along with sophisticated workflows that deliver repeatable sets of actions in response to events. The solution is capable of delivering a recovery point objective (RPO) of zero (no data loss) and a recovery time objective (RTO) of minutes to under an hour, depending on the topology and workloads.
GDPS also provides and coordinates the HyperSwap* function, which transparently switches the host I/O from the primary copy to secondary copy if the primary copy is subject to a failure event. This is done within a few seconds, typically with no disruption to workloads active in the sysplex. This same capability is also available to z/VM images and guests such as Linux* on IBM Z, and to z/OS outside the GDPS sysplex via the GDPS z/OS Proxy feature.
In the event that one site suffers a major outage, the worst case is that clients need to execute a script of a handful of statements to recover the systems into the remaining site.
GDPS Global Solutions
As the name suggests, GDPS Global solutions use Global Mirror or z/OS Global Mirror, both of which are asynchronous replication technologies, to replicate data over unlimited distances. They provide replication management and monitoring of the steady-state environment, along with orchestration, through powerful workflows, of recovery actions in the disaster recovery (DR) region in the event of a production region disaster. GDPS also provides the ability to toggle between these two regions if this is a stated requirement. With asynchronous replication, some degree of data loss is always expected in unplanned scenarios, but this is typically in the low seconds range. RTO will be 30-60 minutes, depending on how long it takes to restart systems and workloads in the DR site.
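The difference between an RPO of zero and an RPO in the seconds range can be illustrated with a small Python sketch. It is a toy model, not GDPS code: the `Write` record, the `data_loss_window` function and the timings are all hypothetical. It assumes that synchronous replication acknowledges a write only once the secondary copy has it, while asynchronous replication lets the secondary lag behind.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Write:
    seq: int             # order in which the primary acknowledged the write
    committed_at: float  # seconds on an arbitrary clock (hypothetical data)


def data_loss_window(writes: List[Write], last_replicated_seq: int,
                     failure_time: float) -> float:
    """Seconds of acknowledged work missing from the secondary copy
    if the primary is lost at failure_time."""
    unreplicated = [w for w in writes
                    if w.seq > last_replicated_seq
                    and w.committed_at <= failure_time]
    if not unreplicated:
        return 0.0  # secondary has everything: no data loss
    # Loss spans from the oldest unreplicated write up to the failure.
    return failure_time - min(w.committed_at for w in unreplicated)


# Five writes, one per second, then a failure at t = 5 s.
writes = [Write(seq=i, committed_at=float(i)) for i in range(5)]

# Synchronous case: every acknowledged write is already on the secondary.
sync_loss = data_loss_window(writes, last_replicated_seq=4, failure_time=5.0)

# Asynchronous case: the secondary is two writes behind at the failure.
async_loss = data_loss_window(writes, last_replicated_seq=2, failure_time=5.0)
```

In this model the synchronous case loses nothing, while the asynchronous case loses whatever was acknowledged after the last replicated write, which is why the replication lag bounds the achievable RPO.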
GDPS Metro Global Solutions
These three- and four-site solutions combine elements of the Metro and Global solutions to deliver Metro distance near-continuous availability along with out-of-region DR capabilities, giving clients the best of both worlds. Within the production region, GDPS Metro capabilities protect against single or multiple component failures, while out-of-region protection is provided by GDPS Global capabilities. Moving to a symmetrical four-site configuration provides the flexibility to switch between regions and run production in either with an equivalent, highly resilient profile. Certain industries with the most stringent availability targets, such as the finance industry, are being required by regulators to demonstrate the ability to run in both their normal and DR locations for extended periods of time to validate stated capabilities.
GDPS Continuous Availability
The GDPS Continuous Availability solution uses software-based replication of data provided by IBM InfoSphere* Data Replication for z/OS for Db2*, IMS* and VSAM data types. It provides low recovery times for the most critical workloads in a client mainframe environment. This is achieved by having a sysplex active at all times in two different regions that can be separated by any distance, so long as the software-based replication can be provisioned (along with network connectivity) between them. No requirement exists for a clustering solution between the regions. GDPS provides monitoring and orchestration of various components that comprise the overall solution, ensuring that, when a failure is encountered by a workload, that workload is switched to the alternate region with the minimum delay. For a visual of GDPS configurations, see Figure 1.
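The four solution groups described above can be summarized as a small lookup structure. The following Python sketch is an illustrative reading of this article, not IBM sizing guidance: the `GdpsFamily` type, its field names and the `shortlist` helper are hypothetical, and the flag values are assumptions drawn only from the descriptions in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GdpsFamily:
    name: str
    replication: str
    cross_region: bool         # protects against loss of a whole region
    zero_rpo: Optional[bool]   # None = not stated in the article


FAMILIES = [
    GdpsFamily("GDPS Metro", "Metro Mirror (synchronous)",
               cross_region=False, zero_rpo=True),
    GdpsFamily("GDPS Global",
               "Global Mirror or z/OS Global Mirror (asynchronous)",
               cross_region=True, zero_rpo=False),
    # Assumption: zero RPO applies to the Metro leg inside the production
    # region; the out-of-region leg is asynchronous.
    GdpsFamily("GDPS Metro Global", "Metro plus Global",
               cross_region=True, zero_rpo=True),
    GdpsFamily("GDPS Continuous Availability",
               "software-based (InfoSphere Data Replication)",
               cross_region=True, zero_rpo=None),
]


def shortlist(need_cross_region: bool, need_zero_rpo: bool) -> List[str]:
    """Return the families whose stated characteristics meet both needs."""
    return [f.name for f in FAMILIES
            if (f.cross_region or not need_cross_region)
            and (f.zero_rpo is True or not need_zero_rpo)]


both = shortlist(need_cross_region=True, need_zero_rpo=True)
local_only = shortlist(need_cross_region=False, need_zero_rpo=True)
```

Under these assumptions, asking for both out-of-region protection and zero data loss leaves only the combined Metro Global group, which matches the "best of both worlds" positioning in the text.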
Automation and Orchestration Are Key
Two things are critical to achieving the highest possible availability levels: automation, which removes the need for direct human intervention when an event occurs, and orchestration of the recovery actions that follow. GDPS solutions are designed to provide automation as close to the objects being automated as possible for maximum efficiency. GDPS also provides a powerful workflow engine and a simple script language to orchestrate the actions often required in planned or unplanned events in a repeatable and predetermined way.
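The idea of a repeatable, predetermined sequence of recovery actions can be sketched generically in Python. This is not the GDPS script language or workflow engine, whose syntax is not shown in this article. The `run_workflow` function and the step names are hypothetical, and the sketch assumes that each step reports success or failure and that the sequence stops at the first failure.

```python
from typing import Callable, List, Tuple

# A step is a label plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_workflow(steps: List[Step]) -> List[str]:
    """Run the steps in their fixed order and stop at the first failure,
    returning a log of what was attempted."""
    log = []
    for name, action in steps:
        ok = action()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break  # later steps depend on earlier ones, so do not continue
    return log


# Hypothetical recovery sequence; the actions are stand-ins that succeed.
site_recovery = [
    ("make secondary disks usable", lambda: True),
    ("restart systems in remaining site", lambda: True),
    ("restart workloads", lambda: True),
]
recovery_log = run_workflow(site_recovery)

# The same runner with a failing step shows the stop-on-failure behavior.
failed_log = run_workflow([
    ("make secondary disks usable", lambda: False),
    ("restart systems in remaining site", lambda: True),
])
```

Because the order and content of the steps are fixed in advance, the same sequence runs identically in a planned test and in an unplanned event, which is the property the text attributes to scripted orchestration.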
Even with the most available systems infrastructure we have today, GDPS solutions increase the resilience of the overall environment and enable hundreds of mainframe clients to meet availability and DR objectives.
David Clitherow is the global offering manager for the High Availability Services portfolio within IBM Global Technology Services.