Why Having a Backup and Recovery Plan Improves IBM i Vitality
Expert Debbie Saugen explains how to develop a backup and recovery plan and monitor the vitality of your IBM i system
By Jennifer Goforth Gregory
03/26/2021
Monitoring the vitality of your IBM i system is a cornerstone of keeping your business running 24-7. A key component of that vitality is the ability to recover quickly after a disaster with the least disruption to your customers and revenue streams.
Debbie Saugen has heard all the excuses for why a company isn’t prioritizing backup and recovery for its IBM i system. “Everyone says they aren’t going to have a disaster, their business isn’t like our other customers’, or they don’t live in a disaster-prone area,” says Saugen, owner of Debbie Saugen Consulting. “But I've worked with hundreds of customers in real-life disasters over the years. It’s not just natural disasters: it’s hardware failures, it’s user errors and now it’s a problem with security. If you have a breach, you may need to recover your system in order to recover from that security breach. Every company needs a backup and recovery plan and solution for IBM i.”
Saugen worked at IBM for 38 years. Her roles included technical owner of backup recovery for IBM i and leader of the IBM i recovery team at IBM Business Resiliency Services for 18 years.
Moving Backups to a Virtual Tape Library in the Cloud
Saugen finds that most IBM i clients currently back up the IBM i locally to physical tapes. Recently, she met with a client that backed up their IBM i system and data every day to the same physical tape, which she says is an especially poor strategy. By moving to the cloud, clients have their IBM i data off-site and have multiple copies of the data, which can greatly improve the quality and speed of recovery.
As clients realize the benefits of cloud backup, Saugen is increasingly helping organizations transition to a virtual tape library (VTL) solution—especially businesses in industries that need 24-7 availability, such as healthcare, banking, manufacturing and hospitality. With a VTL, the organization first backs up locally, and then the virtual tapes are replicated remotely to the cloud. If needed, you can then quickly restore files or recover your local production system, but also have the replicated VTL in the cloud for disasters that affect the physical location.
Businesses find this transition to be a good solution because of the high performance of the combined backup and off-site cloud storage for recovery. Because the backup data is off-site, you can easily and quickly recover your system after a disaster. As part of a VTL solution, Saugen helps clients use IBM Backup, Recovery and Media Services (BRMS) to run parallel saves, which improve performance on both the backup and recovery sides. The improved performance and scalability of the cloud make the backup and recovery solution more viable.
“When talking to their executives about the new process and the VTL, I coach my clients to explain that by reducing the backup window they can keep the systems up longer. Because the data is off-site and recovery doesn’t involve shipping physical tapes, they can begin the recovery immediately, instead of waiting for the tapes to arrive, which means a much quicker recovery,” says Saugen.
Her clients often report that their executives will have concerns about the costs. Saugen explains that they need to start by looking at the business impact analysis, which shows how much it will cost per hour when the business is unable to run. By then comparing this amount to the cost of the cloud-based backup-and-recovery VTL for IBM i, the business value quickly becomes apparent.
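The comparison Saugen describes comes down to simple arithmetic. The Python sketch below is one way to frame it, not a method taken from Saugen. The function names and the dollar figures are hypothetical, and a real business impact analysis would supply the actual hourly cost.

```python
def breakeven_outage_hours(downtime_cost_per_hour, annual_solution_cost):
    """Hours of outage per year at which the solution pays for itself."""
    if downtime_cost_per_hour <= 0:
        raise ValueError("downtime cost per hour must be positive")
    return annual_solution_cost / downtime_cost_per_hour


def avoided_loss(downtime_cost_per_hour, hours_saved_per_incident,
                 incidents_per_year=1):
    """Revenue protected when each recovery finishes sooner."""
    return downtime_cost_per_hour * hours_saved_per_incident * incidents_per_year


# Hypothetical figures, for illustration only.
hourly_cost = 20_000       # from the business impact analysis
solution_cost = 60_000     # yearly cost of the cloud VTL service

hours = breakeven_outage_hours(hourly_cost, solution_cost)
print(f"Solution pays for itself after {hours:.1f} hours of avoided downtime")
```

If a single recovery from the cloud saves more hours than the break-even figure, the solution has covered its yearly cost in one incident.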
Understanding Business Needs for Recovery
Once you decide to move to the cloud, Saugen recommends focusing first on recovery and then backup. Because the entire focus of recovery is getting your business back online and fully operating, it’s essential to understand exactly how quickly you need to recover your IBM i system.
She explains that the first two decisions should be the recovery time objective (RTO) and the recovery point objective (RPO). By jumping into the process without fully understanding business needs, companies can easily select a solution that lacks the performance or the recovery time needed to continue serving customers, which means further financial and revenue damage.
“Clients need to first understand the ideal recovery time for their business and, even more importantly, how usable the data is when they get it back,” says Saugen. “When looking at the various cloud solution offerings for IBM i backup and recovery, you need to compare how those options perform compared to the RTO and RPO needed for your business.”
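Checking offerings against the RTO and RPO can be reduced to a short filter. The sketch below assumes each candidate solution has a tested recovery time and a worst-case age for the data it restores. The class, the field names, the option names and every number are hypothetical placeholders, not figures for any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    recovery_time_hours: float  # time to bring the system fully back
    data_loss_hours: float      # worst-case age of the restored data


def meets_objectives(option, rto_hours, rpo_hours):
    """True when the option recovers fast enough and with fresh enough data."""
    return (option.recovery_time_hours <= rto_hours
            and option.data_loss_hours <= rpo_hours)


def viable_options(options, rto_hours, rpo_hours):
    """Names of the options that satisfy both objectives."""
    return [o.name for o in options
            if meets_objectives(o, rto_hours, rpo_hours)]


# Hypothetical candidates, for illustration only.
candidates = [
    RecoveryOption("option A", recovery_time_hours=48, data_loss_hours=24),
    RecoveryOption("option B", recovery_time_hours=8, data_loss_hours=24),
    RecoveryOption("option C", recovery_time_hours=6, data_loss_hours=36),
]

print(viable_options(candidates, rto_hours=12, rpo_hours=24))
```

An option has to satisfy both objectives: a fast recovery that restores stale data fails the RPO, and fresh data that takes days to restore fails the RTO.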
After designing her clients’ backup and recovery for IBM i, Saugen advises clients about the three keys to success: test, test and test. One of the biggest mistakes she sees clients make is creating a backup solution and then replicating the backup, but not putting the entire IBM i system back together for a full recovery using the data backed up. She says that almost every time a client runs a test, they find issues that need to be resolved.
“Unfortunately, I’ve seen many clients who didn’t do a full test find out that it takes a week to get the IBM i system back up and running or that they are missing key files. Then they have a real disaster—and they find out that they didn’t have a good solution.”
When reviewing logs and testing recovery, look for locked files or files that weren’t included in the backup. Saugen sees many clients verify that the backup occurred, but not that all the necessary files were actually backed up.
Saugen recommends testing your recovery of IBM i at least twice a year. During the testing process, follow the documented recovery procedures and make any needed changes to the document during the test. She says that while IBM i provides the “Save” menu options and BRMS for the backup, if you’re not checking your logs and testing your recovery then you may have an exposure during a disaster.
Many companies think about backup and recovery only when they’re facing a disaster. By taking the time to plan now for a disaster that will inevitably come at some point, you can improve the vitality of both your IBM i system and your business.
Jennifer Goforth Gregory is a freelance writer.