Putting a Plan in Place for Cyber Resiliency
Milt Rosberg, global VP of worldwide sales, marketing and business development at Vanguard Integrity Professionals, on how to ensure cyber resiliency
Reg Harbeck: Hi, I’m Reg Harbeck and today I’m here with Milt Rosberg, who is the global VP of worldwide sales, marketing, and business development at Vanguard Integrity Professionals. He has been deeply rooted in the security industry since 1999, including risk management at IBM. Milton Rosberg has worked with Vanguard Integrity Professionals as a consultant since 2001, along with other industry-leading companies that provide solutions for ACF2, Top Secret, and RACF. Other areas of expertise include z/OS, vulnerability engagements, SIEM connectivity, compliance audit solutions, and system migrations. Milt, welcome.
Milt Rosberg: Well, thank you very much for the introduction, and look forward to chatting with you today.
Reg: Now Milt, we’ve talked before and I got a bit of a sense of your background already, so rather than digging too deeply into how you ended up here in the mainframe and security space, I’d like to really focus on our topic today, which is cyber resiliency as part of an organization’s real DNA. I think one of the challenges that we have, as we look at the mainframe certainly but more generally in IT, is the whole idea of resiliency given the growing number of security threats and types of security threats. So I guess maybe if we could start by getting a sense from you—what is the lay of the land right now? What are the key issues that organizations, and particularly mainframe-using organizations, have to be considering as they ensure that they’re set up to be properly resilient going forward?
Milt: Yeah, it’s an interesting topic. As an organization, we spend a fair amount of time with senior executives of Fortune 100 and 500 companies, and they’re always looking at a way that they can answer to the board of directors in a very concise way that their systems are compliant, that they are secure, and that their customers’ information is safe. It’s a real battle for companies to make sure that they don’t end up being on the front page of the New York Times or splashed on the news that it was another hack. Many of the large corporations around the world have had their data compromised from a variety of different ways, so the organizational DNA has to be such that it’s more than just an idea. It has to be a real plan on how you’re going to implement security in your organization. It’s an interesting topic because if you do any of the analysis on the amount of pressures that is put on the CSOs—and there’s a lot of articles in this particular space. They have the gun to their head all the time trying to make sure their systems are secure, and the competition of those that are trying to get into the systems they're coming from like nation states. There are hacktivists and in the future it’s all going to be—there’s artificial intelligence stuff. It’s going to come at the speed of electricity, and so the idea is that you can get a plan in place that’s going to do everything that you possibly can to have a structure that’s going to protect the organization’s data. One of the things that we’ve been asked to work on with large clients is helping them understand what they want to have in their baseline, developing a baseline for them, and then help implement the baseline. This is a starting point for an organization to really have a good feel for what kind of information should be in the baseline so that they have a place to start and do the kind of checks that they need to do all the time.
Reg: Now baseline, I think just to make sure that everybody is on the same page with the baseline, because that’s such a key concept. Of course these days it’s not just a single snapshot of how your environment is set up when you first install. It has to be a dynamic thing, but it’s sort of a best-practices picture of where you’re at, one that you’re moving forward from. Maybe you can elaborate just a little bit on how you characterize a really good solid base.
Milt: I’ll just give you a couple of examples that we’ve experienced over let’s just say the last few years. We’ve been in the baseline business if you will since 2009. We did the DISA STIGs and we’ve built the DISA STIGs for RACF, ACF2, and Top Secret. We also have clients that come to us that say you know what? Our organization is completely different than any other organization and what we would like to do is take our best practices, our audit requirements, and our security requirements and build that into a checklist and automate the checklist for the mainframe and make sure that that checklist, when it’s run, it produces the results of what the expectation is for that particular check. Let’s just say who has access to authorized libraries, and then it takes the result of that and it gives you the findings if those are correct. We have some clients that are doing as few as, let’s say like 150-160, and other clients are doing 1600 checks to make sure—
Milt: That they’re meeting their corporate requirement. So it just depends on the environment. It’s not a simple process but it’s an important process, and it should also be done not only on the mainframe but any open systems architecture that you have so that you can give the audit group, the internal audit group, a place to start to really make sure that they have a good understanding if the system that’s protecting the information meets the very basics of what the company wants to have so that they have a way of measuring their success.
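As an illustration of the kind of automated check Milt describes (who has access to authorized libraries, compared against what the organization expects), here is a minimal Python sketch. It is not Vanguard's tooling or the DISA STIG format. The check ID, user IDs, and data layout are all hypothetical, and a real implementation would read the observed access from the security database rather than from a literal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One baseline check: the user IDs the organization expects to have access."""
    check_id: str
    description: str
    expected: frozenset


def run_check(check, actual_users):
    """Compare observed access with the expectation and return a finding."""
    unexpected = sorted(set(actual_users) - check.expected)
    return {
        "check_id": check.check_id,
        "passed": not unexpected,
        "unexpected": unexpected,
    }


def run_baseline(baseline, observed):
    """Run every check; a check with no observed data is treated as empty access."""
    return [run_check(c, observed.get(c.check_id, [])) for c in baseline]


# Hypothetical baseline and observed access, for illustration only.
BASELINE = [
    Check("APF-001", "Update access to authorized libraries",
          frozenset({"SYSPROG1", "SYSPROG2"})),
]
OBSERVED = {"APF-001": ["SYSPROG1", "SYSPROG2", "TEMP99"]}

findings = run_baseline(BASELINE, OBSERVED)
for finding in findings:
    print(finding)
```

Whether a client runs 150 checks or 1,600, the shape stays the same: each check carries its own expectation, and the run produces one finding per check.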
Reg: So, I hear a few different threads in there. On the one hand of course, just something somewhat akin to the IBM Health Checks, but on the other hand, security technical implementation guides or STIGs—you know, providing an industry recognized best practice baseline. But the third thread that really jumps out at me is just the nature of automating this because I’m going to guess that the amount of manual involvement when there has to be some has to be somewhat minimized just so that you can do this regularly with a fair amount of detail and depth. So how do you see those coming together?
Milt: When we first started building the DISA STIGs, we worked with a client that had three people using a tool that just does the audit. They had to meet a government requirement every quarter to get paid for the processing. They would work for one quarter with three people just trying to get some idea of how close they were to the DISA STIGs. They had to sign off on it in order to get their monthly payment from the government to do the processing. It took the full quarter and then they had to restart again the next quarter. The bad part is that they were never 100% accurate, yet they had to sign off as if they were 100% accurate, and that kind of repetitive checking is what computers do extremely well. So the first thing that you would want to do is make sure you define every part of the system that you want to look at. The other thing is you want to do a full assessment on your system to get a feel for what access people have and how it works. You’re going to do a full security assessment on your mainframe computer. There’s a lot of companies that do that: Vanguard does, and IBM and Broadcom and others. A lot of independent companies do this successfully to get a real baseline on where you are on your system, to make sure that you are meeting the standards that you want to have for security and that you don’t have any back doors on your system. The advantage of building the baseline and then running the checks against the baseline is that it’s going to give you results, and when you get the results, those go into a result file. They get collected across your enterprise. Let’s say you have 100 LPARs as an example, and then you can push those out into various forms. 
For example, you can push them out to a SIEM technology, or you can push them to internal reporting, but you need to get it in an easy way for senior management to get a snapshot of where the system is at this particular time so that when they look at it, they go, okay, now I have a feel for where our system is across our enterprise, even if it is only 10 LPARs or it’s 15 LPARs. I get a real feel for what we have to work on and how close we are to the baseline that we established for our company. I think that’s really the key, to get a real feel for where you want to be and where you want to go. It’s never going to be 100%, but at least you can start getting the controls built in place to get you there.
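The collection step described above, where result files from each LPAR are rolled up into a snapshot for management, might look roughly like the following Python sketch. The tuple layout, LPAR names, and check IDs are assumptions made for illustration. The interview does not describe the actual result-file format or how results are delivered to reporting tools.

```python
from collections import defaultdict


def summarize(results):
    """Roll per-check results up into a per-LPAR compliance snapshot.

    `results` is a list of (lpar, check_id, passed) tuples, a hypothetical
    stand-in for the result files collected from each LPAR.
    """
    counts = defaultdict(lambda: {"passed": 0, "failed": 0})
    for lpar, _check_id, passed in results:
        counts[lpar]["passed" if passed else "failed"] += 1

    summary = {}
    for lpar, c in sorted(counts.items()):
        total = c["passed"] + c["failed"]
        summary[lpar] = {
            "passed": c["passed"],
            "failed": c["failed"],
            "compliance_pct": round(100.0 * c["passed"] / total, 1),
        }
    return summary


# Hypothetical collected results, for illustration only.
RESULTS = [
    ("LPAR01", "APF-001", True),
    ("LPAR01", "APF-002", True),
    ("LPAR01", "USR-010", False),
    ("LPAR02", "APF-001", True),
    ("LPAR02", "APF-002", True),
]

summary = summarize(RESULTS)
for lpar, row in summary.items():
    print(lpar, row)
```

A percentage per LPAR is only one possible view. The point is that the same summary can feed a dashboard, an internal report, or an event feed without anyone revisiting each LPAR by hand.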
Reg: Now of course one of the issues about controls—and perhaps especially control policies for things such as patches—is that these baselines identify gaps, quite often ongoing emergent gaps, and the necessity of keeping patched, which is something we’ve always been really careful about on the mainframe. One of the things that you learn as a new mainframer is the difference between leading edge and bleeding edge—or even not leading edge, because your company wants to be really careful. Suddenly that goes out the window if you want to be current, so maybe you can talk about patching policies as a complement to these baselines.
Milt: Yeah, there’s really two pieces to the baseline. Let’s say you run your baseline. We’ll do something simple. Let’s say you have 100 checks and you run them and then you get the results file—it’s collected on each of your ten LPARs. You push that back and you find out, well, out of my 100 checks, I have ten that need remediation. So you want to go ahead and make sure you do the remediation for those ten checks. Then you want to establish something that manages the policies, now that you have the ten checks remediated, so you don’t end up getting—I’m going to call it creep—across your organization, where that problem arises again. Some of that can be the human element. It could be who has access, or whether the right people are looking at things. So you have to have a policy in place that’s actually going to look to make sure that after you complete the remediation, which could be extensive in some cases, those problems are not recurring. So that’s one part of it. The second part of it is what we’re seeing from our clients: they want to put in place a way to get the controls for when you have patches put on a mainframe—and we had a couple of clients come to us recently and this is an interesting story. On the mainframe, typically you would do your patch updates like once a year. Then they said, well, we’ll do our maintenance patch updates every six months. But now we’re getting to the place where there’s pressure to make sure that whatever is released for maintenance or patch updates every month is being evaluated, which is a huge task. If you take a look at a company that has, let’s say, a large enterprise of 20 LPARs, which is a lot to fool around with, you don’t want to have to go back to each one of these LPARs to make sure that all the patches are put in place and they meet the requirements. 
The interesting part about this, Reg: it’s being driven by the CSOs, and the term patch was typically applied to open systems architecture. It wasn’t something that you would see was applied to the z/OS architecture, let’s say in the last 15 years. Now it’s being pressured where you have to make sure that all your LPARs have the latest fixes and the right software installed—the right revision, everything is correct, everything is up to date—and so the patch policy is in harmony if you will with making sure that your baseline and your checks are in place so that your organization from the top down feels comfortable that you’re doing everything possible to protect the system.
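To make the patch-policy idea concrete, the sketch below compares a required set of fixes against what each LPAR reports as installed and lists what is missing, so nobody has to visit each LPAR by hand. It is a hedged Python illustration only. The fix identifiers and the inventory structure are invented for the example and do not represent SMP/E output or any vendor's format.

```python
def missing_fixes(required, installed_by_lpar):
    """Return {lpar: [missing fix IDs]} for every LPAR that is behind policy."""
    report = {}
    for lpar, installed in sorted(installed_by_lpar.items()):
        missing = sorted(required - installed)
        if missing:
            report[lpar] = missing
    return report


# Hypothetical policy and inventory, for illustration only.
REQUIRED = {"FIX0001", "FIX0002", "FIX0003"}
INSTALLED = {
    "LPAR01": {"FIX0001", "FIX0002", "FIX0003"},
    "LPAR02": {"FIX0001"},
}

report = missing_fixes(REQUIRED, INSTALLED)
for lpar, fixes in report.items():
    print(f"{lpar} is missing: {', '.join(fixes)}")
```

An empty report means every LPAR meets the policy. Running the same comparison each month is one way to keep the patch policy in step with the baseline checks.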
Reg: Now, and that’s important but of course as you know on the mainframe we’ve got decades’ worth of habits about doing our SMP/E—you know, receive, apply, test in the test environment, set up a weekend when you’re going to install this—and this seems to have just kind of flooded in over top of that. So it means getting into a completely new habit, and not just for the security software, because as the STIGs show us, every piece of software on the mainframe in some way touches security. That’s got to be something that takes a fair amount of investment, to get a mainframe organization up to this new level of rapid functionality just in terms of acquiring, applying, testing, and installing maintenance.
Milt: Yeah, the part that you mentioned about making sure that all the implementations is going across the system, that was human dependent. If you think about it, the staffs, the size of the staff—when we work with our clients, the size of the staff that is running the mainframe system, it had a lot of people in it. Now you’ll see cases where it only has maybe four or five subject matter experts—
Reg: Yeah, yeah.
Milt: That try to bring all the other people up to speed. As you know, we hold a conference every single year. It’s a security conference. We invite the vendors in—it’s not a product conference. Vanguard is not pitching tools. We’re actually talking to companies about securing their system and doing the right kind of audit things. We bring IBM in, Broadcom, other companies come in and they speak and we’re all trying to do the same thing. We’re trying to make the system hardened and as healthy as possible, but the number of new people that we get into conference is surprising. We always thought, you know, our conference size is going to shrink. The number of new people that are coming in is like 30-40%. These are people that are coming from other departments. Maybe they could be the open system side or they’re a Db2 person or they’re a system programmer doing some other kind of application work, and now they’re coming to the security spot. Why is that? Because of the youth of the people that are retiring out and the number of people that have met their window of time—let’s say it’s 25 years—and they say I’m going to be out of here in two years, you better find a replacement for me. And all that subject matter expert information that they’ve been gathering for all these years is going out the door, so the vendors, all of us have to work really, really hard to try to automate as many of these processes—like patch control as an example or updating your systems in the right way or developing tools that can measure against the baseline, or do as much as possible to make sure your databases are in sync. All these things need to be put in place because the people that have the knowledge are just leaving the industry. They’re just not there.
Reg: Hmm. So basically it sounds like, among other things, the need to get a new generation on the mainframe, which is something I’ve been flagging since 2004 when I wrote a white paper about it, is merging in with this whole other set of concerns, because of course everything does touch on security. So I guess as a CIO or somebody making decisions for an organization, you have to be looking at the budgets from several different areas—not just the budget for your security software and security people, but your systems people, your new people, your training, and then of course the cost of changing corporate culture and behaviors in order to be acquiring, testing, applying, and accepting maintenance at a much greater pace than back when there were a lot of mainframers. This must be quite a significant cost to an organization, plus the cost of ensuring that you’re ready to deal with an incident when there is one.
Milt: Yeah. I’m just going to say this candidly but clients that we work with, they’ll say I can’t get any budget for this and I can’t get any budget for that, but at the same time the audit pressure that’s being pushed on these mainframe shops is huge. There’s been so much attention put to the open systems architecture, and I’m going to call it perimeter security—
Milt: That’s been around the mainframe in the past. They are the ones making the most noise and they’ve got the most amount of money or the largest percentage of it, but if you take a look at what’s going on on the mainframe, one of the places that people are very concerned about, and I think the increase has gone up substantially let’s say in the last three or four years, and that’s the concern for people who are the authorized user—
Milt: And the authorized user on the mainframe—if I was going to hack a mainframe, the person that I would want to go ahead and hack is the person that’s been there 20 years. I would do all my social engineering to figure out a way that I could get into that system, and if you take a look at the nation states, they have all the resources to do that. So it’s important to have the controls in place for privileged access monitoring, not the management part but the monitoring part, to see what those that have privileged access are looking at, when they’re looking at it, whether there’s something out of balance that doesn’t make sense, and to take a real hard look at it. That also has to be an element that’s in the organizational DNA of securing information. Just a few years ago, let’s say ten years ago or eight years ago, the person that had the privileged access management had access to do whatever they needed to do on the system. Let’s just say it’s a high-level systems person. You were never worried about that. Now we’re seeing where they are putting in multifactor authentication. They’re putting in tools to take a look at the monitoring. They’re really using all the stuff that’s out there today where you can really view what the person is doing—not because you don’t trust them. You still have that trust in the person, but you want to make sure that you can take a look at what they’re looking at, so if somebody gets in the wrong place, you have a way of monitoring that if they shouldn’t be there.
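The monitoring idea Milt describes (what privileged users are looking at, when, and whether anything is out of balance) can be sketched as a simple rule-based filter. This Python example is an assumption-laden illustration, not a description of any monitoring product. The event fields, the "usual hours" window, and the per-user list of known resources are all hypothetical.

```python
from datetime import datetime

# Hypothetical policy: privileged work normally happens between 07:00 and 18:59.
USUAL_HOURS = range(7, 19)


def flag_events(events, known_resources):
    """Return the events that look out of balance, with the reasons why."""
    flagged = []
    for event in events:
        reasons = []
        when = datetime.fromisoformat(event["time"])
        if when.hour not in USUAL_HOURS:
            reasons.append("outside usual hours")
        if event["resource"] not in known_resources.get(event["user"], set()):
            reasons.append("resource not seen before for this user")
        if reasons:
            flagged.append({**event, "reasons": reasons})
    return flagged


# Hypothetical access events and history, for illustration only.
KNOWN = {"SYSADM1": {"SYS1.PARMLIB"}}
EVENTS = [
    {"user": "SYSADM1", "time": "2024-03-05T10:30:00", "resource": "SYS1.PARMLIB"},
    {"user": "SYSADM1", "time": "2024-03-05T02:14:00", "resource": "PAYROLL.MASTER"},
]

flagged = flag_events(EVENTS, KNOWN)
for event in flagged:
    print(event["user"], event["resource"], event["reasons"])
```

Flagged events are a prompt for review rather than an accusation, which matches the point made in the conversation: the monitoring is there even though the person is still trusted.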
Reg: You protect your best people by being able to prove that they haven’t been up to anything. That’s what separation of duties is all about. So given that, maybe if you could take a look at the whole idea of putting this all in place into a cyber resilience configuration core environment—you know between an employee who is in training, ongoing security assessments, penetration testing. What does this look like?
Milt: I think from the very start, you need to take a real hard look at how you want to make sure your culture of the organization is protecting each of the elements. On the mainframe side, there is so many different ways that you can do this, but you need to develop a plan, and the very first part of your plan has to be that you understand where your system is today. Of those cases where we go in and do, let’s say like a security assessment, when we go through where the system is and what needs to be worked on and how it needs to be repaired. Sometimes that’s a little painful, not because it’s not a good company or they don’t have a good organization, just the fact that—maybe they’ve done some acquisitions, a fair amount of turnover in the organization, and now how are you going to harden that system? So the first place that we strongly recommend is you do an assessment to figure out where your system is. How does it really look? Then you need to decide, once you do the assessment, you develop a plan that makes sense. The employees that are going to have their hands on the system. They need to have full awareness of what they’re looking at, how they’re looking at, and what they’re doing. Of the privileged users that are on the system, one of the bigger problems is lack of employee training.
Milt: So, they’re getting on the system. They’re allowed to get in the system and do things, not on purpose that they’re making a mistake, but they just don’t have the knowledge yet. So you need to have a very, very good program doing the training. The last part of that is you ought to do regular pen testing. You want to take a look at your system not only on the outside, but you want to look at it on the inside. You want to do everything you can to make sure that your system is solid. So it’s just not one thing. You want to have your baseline in place. You need to go ahead and make sure you have remediation in place. By the way of those systems that we work on and we give an assessment to, sometimes a remediation can take years. You’re not going to get it all done in one day, so there has to be a way that you can explain this to senior management or management, that you’re going to have to spend this amount of money. One of the obstacles that we see is that some people say well, I’m not sure we even need to bother to do that. We’re going to be off the mainframe in three to five years. I think it was ’99 or ’98—I don’t know who the guy was who said everybody is going to be off the mainframe by 2000, something like that, and he wrote an article about it right?
Milt: Everybody is going to be off the mainframe. Then the latest thing we hear is we’re going to be off the mainframe in a couple of years and we’re going to go to the cloud. Just because you’re going to take and change your architecture to go to the cloud, you still need to make sure that your systems are secure. Or you’ll hear, I’m just going to go ahead and outsource it. We’re not going to bother with it anymore. I’m just going to give it to somebody else to worry about. It doesn’t change the fiduciary responsibility of protecting your data. We work with a lot of outsourcers and all of them are very serious about doing the very best job of protecting the data, and you need to make sure that that becomes part of your DNA. If you plan on moving to an outsourcer, it has to be a well thought out plan. Put the policies in place that you want to control for your company and the software that locks it down so that people can only do the things that they’re supposed to look at. Then if there’s a plan to put it up in a cloud, you need to think about that also. Many of these large outsourcers are providing those kinds of services, but it doesn’t change the fact that you still need to have your information secured. You still need to have things in place to do the monitoring of who has access, you still want to go ahead and have an incident response plan in place that you practice on a regular basis. You want to make sure that your patches are put in place by the outsourcer, or you do it yourself. All those basic things that you’re using to run your business and protect it from a problem, they’re still in place whether you outsource it, you stick it in a cloud, or you marry up with another company and use it together.
Reg: Yes. So maybe in just bringing this all together, as you think about an organization—you don’t have to name them of course, but that you’ve worked with, that you’ve helped go from relative unawareness of their environment to a clear baseline to monitoring to implementing, what that indicates to getting to an ongoing monitored automated baseline that is relatively current and kept current with security requirements. Can you think of an organization as an example that you’ve done that with?
Milt: Yeah, we have several of them, and one of the things that’s really helped I think is that SIEM technology today gives us a way to communicate what the results are. So we have some very large clients that are now in a place—let’s say you have a baseline put in there and you gather the results from the baseline. We do this—it’s called aggregation and delivery. So we can aggregate the data from each one of the LPARs and we can put that in a way that we can send it off to the SIEM technology, and that can be available for executives and management. They could look at it down to their cell phone if they need to. We make the technology so it works with any SIEM. The other part of that is that you need to have a way of sending out your baseline. Let’s say your baseline is going across 100 LPARs. You need to have a methodology where you can ship all the new baselines out to each one of the LPARs. Particularly if it’s shipped across a worldwide organization, they need to get a baseline in all of their LPARs so all of them are being evaluated equally. Then the other part of that is you want to make sure if you have patches that need to be put in place, they also need to be put in each one of the LPARs. So you’re normalizing the security of each LPAR, what needs to be monitored and put in place, but on top of that if you have other application stuff that you’re putting out to each of the LPARs, you want to make sure they have all the latest releases. If you take a look at all the latest hacks that are going on, not all of them but many of them, what they’ll do is they’ll actually look for the vendor and try to find—SolarWinds I think was an example of this, where people wanted to infiltrate SolarWinds, and SolarWinds goes out as an authorized party that is sending information out to the system. You need to make sure that you have a way of doing this in a very organized, protected way. 
Other clients we work with, they pay very close attention to make sure that they do the proper checks and change management. They get it loaded up and then they ship the new baseline out to each one of the LPARs and then validate that that is in place. They go ahead and run the scan to get the results. They take the result file and they bring it back. Then the next place you would go is use the same kind of technology and get your patch control in place. Those are just the very beginnings of the things that you need to do to make the mainframe work almost in an open systems architecture environment. This has been going on on open systems for years, and it’s important you get the mainframe to work easily. It’s back to the very thing we talked about in the beginning: where are the resources going to come to run these systems? You’re not going to replace the mainframe, stick it in a cloud overnight. If you’re going to do some of this stuff, it’s going to go on three or four or five, maybe ten years, and in between now and that ten-year window you want to make it run operationally efficient, but you want to make sure you are running it very, very secure.
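One common way to "validate that that is in place" after shipping a new baseline is to compare a cryptographic digest of the approved baseline with a digest of what each LPAR actually holds. The interview does not say how validation is done, so the Python sketch below is only one possible approach. It uses SHA-256 from the standard library, and the baseline contents and LPAR names are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA-256 digest of a baseline definition, as a hex string."""
    return hashlib.sha256(content).hexdigest()


def validate_distribution(approved: bytes, deployed_by_lpar: dict) -> dict:
    """Return {lpar: True/False} showing whether each copy matches the approved one."""
    expected = digest(approved)
    return {
        lpar: digest(content) == expected
        for lpar, content in sorted(deployed_by_lpar.items())
    }


# Hypothetical baseline content, for illustration only.
APPROVED = b"baseline v2: 100 checks"
DEPLOYED = {
    "LPAR01": b"baseline v2: 100 checks",
    "LPAR02": b"baseline v1: 95 checks",  # stale copy that was never updated
}

status = validate_distribution(APPROVED, DEPLOYED)
for lpar, ok in status.items():
    print(lpar, "matches approved baseline" if ok else "DOES NOT MATCH")
```

A digest comparison shows that a copy is identical to the approved one. It does not by itself prove that the approved baseline came through proper change management, which is why the change-control step comes first.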
Reg: Well, this has been a really fascinating and important discussion and description of what it means to really get your organization into an ongoing state of cyber resiliency. Milt, maybe if you can tie this all together with any closing thoughts you have about what people should be keeping in mind with their organizations as they move to the next level with their mainframe and with their whole IT environment in terms of cyber resiliency?
Milt: Yup. What we’re finding because of the size of our organization—honestly, we do a fair amount of custom development, and the custom development requests are coming from customers that are looking for different ways to solve a known problem they’ve been trying to fix for a number of years. And some of those cases, they just want to take standard software and try to squeeze the round peg into a square hole to make sure they solve the problem. This is not throwing stones at any vendor. I’m talking about the mainframe as a system in developing different ways to solve the problem. They need to think about, how can I really fix this issue, how can I really solve this problem, and then come to the vendors and say I really want to fix this problem. I’ve got to fix this thing. What is the best way to help protect the information on the system? We’re seeing a fair amount of this coming into play from the senior level people that are really the true security architects for the organization. To grow a good security architect or find a good security architect, many times they don’t come from the z/OS environment, so there’s a fair amount of education that you have to bring forth, but there are no rules, right? It’s a matter of what you want to solve, how you can program it, how you can develop it to meet the personality of your organization, and none of them are the same. We go across all these different organizations, none of them the same and a lot of that is because of mergers and acquisitions and the nature of the industry that they’re in and the amount of rules that are put on them. They need to come to the vendors that they’re working with and say I really need these five things to make the big leap. I’m at the five-yard line and I want to get to my goal. How can I get the other five yards? How can I get to the goal line, how can you help me get there with what we have today, and what do we need to do over the next two or three years to get there? 
Probably the most important message as a vendor that we can provide is don’t X out any idea. There are no stupid ideas. It’s just ideas that need to be talked about.
Reg: Excellent. Well thank you so much, Milt. This has been outstanding.
Milt: No. I’ve just got to thank you very much for the time and I want to invite everybody to take a look at our security conference we’re going to have this September. It’s a training conference on security. You can go online and take a look at it. We’d love to see you there.
Reg: Excellent, and I understand if people want to learn more about building cyber resiliency into their organization’s DNA, they can visit your website. I’ll be back with another podcast next month, but in the meantime check out the other content on TechChannel. You can also subscribe to their weekly newsletters, webinars, e-books, Solutions Directory and more on this site. I’m Reg Harbeck.