Mainframe Security and Pervasive Encryption With John Connors and Milt Rosberg
By Vanguard Integrity Professionals / October 26, 2022
Vanguard Integrity Professionals president John Connors and global vice president Milt Rosberg on the impact of convenience outrunning security, skeletal staffs and other daunting challenges
John Connors: Sure. This is John Connors. I’m the president of Vanguard.
Milt Rosberg: Yeah, I’m Milt Rosberg, global VP.
John: And yeah, welcome everybody. And like you said, we just got back from a conference last month. We did our annual security conference, and it’s a security conference based upon z/OS primarily, but mostly security stuff. As a matter of fact, one of the highlights that occurred there was one of our competitors, Carlos Flores from Broadcom, was given a lifetime achievement medal from the chairman. So we had a really good time. It was a very good event. We had a lot of folks there, a good training event and a good learning event for everybody.
Milt: Yeah, internationally we had a fair number of participants: banks, insurance companies. One of the things that was interesting: probably about 30% of the attendees were there for the first time.
John: That was good too, yeah.
Milt: It was real good, and so that helped us get a feel for the people who are trying to learn RACF, ACF2, and Top Secret. They wanted to be there and understand security. We also had a big audit presence, which was a little larger than the last time. So it was good to see everybody and not do a Zoom call.
John: Oh yeah the last couple of years, it has all been Zoom. It was great to have an in-person event again.
Reg: I believe that. So that said, I gather that you’ve got some other stuff coming up in the near future as well, including going to conferences outside of Vanguard?
John: Yeah, Milton and company are going to be at GSE in the UK. Is that in the UK or is that Germany this time?
Milt: No, no. It’s in the UK.
John: UK this time—that’s the week of the fourth of November.
Milt: Yup, and we’re looking forward to being there. The attendee list is a good size. They’re completely sold out at the hotel, so it should be a good audience. A lot of the major banks and insurance companies in the UK will be there, and some of the other vendors are using this as an opportunity to bring people in from all over Europe. So it should be a well-attended learning session.
Reg: Well one of the things that we sort of talked briefly about on our last podcast but really is such an important part of the growth and development of the mainframe as the platform of record is the fact that we have so many critical resources and so many critical people with access to those resources. And so too often the response to that has just been to give the trusted people all the access they need to get their job done, and in a world where people can be sneaking onto your mainframe from the outside or not being as trustworthy as they should be on the inside, that’s no longer an appropriate response to security. As I understand, you guys have been doing some work on helping organizations get a much deeper, more reliable handle on things such as privileged access. Maybe if you could talk about that and some of the other things that you guys have been working on with that.
John: Sure. So when you talk about privileged access, you’re really talking about like identity fraud and stuff that’s all internal from the ability of a person who has privileged access. Let’s talk about what that is. So privileged access in our mind is people that have the ability to change and/or manage the security of a system. That includes all the data, the objects, the rules, all the different things on that. And you’ve got to be very, very careful with the terms privileged access management or privileged access monitoring, because you find a tendency in this industry to use the same acronym. So be careful when you’re talking to folks if you see an acronym called PAM. It actually can mean both of those things depending upon who you are talking to—the privileged access management or the privileged access monitoring.
Milt: They can also define it differently depending on the platform, the vendor or software provider, and how it was built and organized. In the z/OS environment, we use a variety of tools to help with the management part, and then we have some really powerful tools we use on the monitoring side, so that’s kind of grown over time.
John: And we’ve got to be careful too because privileged access management and privileged access monitoring are actually platform-specific. You’ll have people say well, we have a monitoring solution for our entire enterprise, and then you actually dig down into the covers and you find out that well, that may not be true. That’s more like an SOC or a notification system than it is a management and monitoring system. It's not platform-specific.
Reg: Of course, it’s like LDAP where you know it’s trying to treat all security as equal, which is nice for a handy dandy way of authenticating across the organization, but when it comes to in-depth management and security, all security is not equal, is it?
John: Well, that’s true. LDAP is a good example. There’s a couple of good authentication protocols: LDAP, RADIUS, a couple of other ones that are out there, Active Directory. All of those things are independent and are great for authentication, but authorization and access is a totally different thing. Typically, z/OS, UNIX, Windows, they all have their authentication separate from their authorization, and that’s where you have to be careful because the access you really want to monitor and the access you want to manage is typically platform-specific.
Reg: Well it’s about the wide range of different types of resources and the different ways of defining them, depending on whether you’re talking ACF2 or Top Secret or RACF, and the challenge of mapping in between them. That must be a really important type of expertise, and you’re talking about the auditors. Even training the auditors in understanding that level of nuance, which is actually pretty critical, must be quite a challenge.
John: Well it is too, even from an acronym point of view. Like you said, Top Secret, ACF2, and RACF—even though they’re all ESMs on the same platform, which is interesting, right? They’re all z/OS. They are external security managers for z/OS itself, but each one of them actually uses specific terms differently. If I think of a RACF profile and I think about a profile in Top Secret, one is a group and one is a resource. They’re completely different.
Reg: Yeah. So that said—
John: It’s interesting you bring up the others. We attended an ISACA conference probably six months ago, and one of the most important topics that they had is the people that have the privilege to do whatever they want to do. The auditors are being pressed by the external auditors to make sure that the people that have access, they’re able to understand what they’re doing, when they’re doing it, and how they’re doing it. Not necessarily how they were assigned their position and what role they had, but what are they actually doing on the system and what activity do they have and how is that being watched and how are you controlling that?
Milt: And that’s kind of the difference between the two different camps if you think about it, right? Privileged access management is what we historically would have said: what is the role that you have, what is the access that you currently possess? Not what have you actually done with that access, which is what the monitoring part is. So if you think about it, I might be an elevated privilege user—for instance, system special in RACF or, you know, non-cancel in ACF2. And you think about those, I can do whatever I want. The fact that I have the privilege is not necessarily the fact that I use the privilege, whether nefarious or good. It doesn’t really matter. At the end of the day, it’s what are you doing with that privilege?
Reg: Hmm. So that said, I guess one of the challenges must be to sort of install the concepts in the brains of those who need to make decisions and implement and use these as much as to have a solution then that enables them to be effective.
John: Well I think that’s what Milt was trying to allude to there when he talked about the ISACA. So you think about the auditing positions. It used to be, years ago, you would have an audit on what you have access to. You would have ownership or privileged access or access monitoring that would say okay, every year I’m going to certify what these people have. Well that’s good, but you really want to go the next step. Not only do I have all of these people that have these privileges, and I’ve recorded those and I’ve reported on those and I’ve approved those through a business process, but now what happens if they use that privilege? Do I monitor what they did and the change to the system and the change to the security profile of the company? Is that being reviewed either instantaneously, in near real time, or maybe over time at a different level? Those are the kinds of things that have changed in the world. It used to be auditors would look at that annual business process. Now they really want to see evidence of whether you are doing things in near real time, or at least after the action, so those things are new.
Reg: Monitoring for stuff must be a challenge. I mean granted, the mainframe is not as resource-constrained now as it was back in the early days, and yet you still can’t really monitor everything. Or can you?
John: Well that depends on who you are [laughs]. So there are tools and we actually do have some of those, but there are two different methods you can think about. The z/OS platform itself has what they call SMF, which is System Management Facilities. It monitors all the ESMs and all the different actions that are going on if you set it up. That’s the key to it: if you set it up. So for instance, if you tell it to record violations on specific data sets, sure, it will monitor all those and you can do these things. You can alert on them, you can do some after-action reporting, you can do annual certification that they were utilized. But it doesn’t give you that near real time on everything unless you spend the time to do it, and there’s that resource constraint you were just talking about. How many resources can you put toward those things? You really have to decide what that is. Our accountability for PAM is a little bit different, okay? We start looking at PAM from a monitoring point of view by putting in, if you will, a system of resources that actually looks at things as they go through the security management system. Whether it’s ACF2, Top Secret, or RACF really doesn’t matter to us. Remember these are all external security managers, so that means you’re actually asking the question through the security manager: can I have access and what did I do with that access? So now you can get into using fewer resources by monitoring activity. And we can monitor the activity that happens, and if something happens that’s of interest to us, now we can actually start digging down in. We can throw out all the chaff but keep the wheat, if you will, for when we need it, by monitoring the actual utilization.
Milt: You know John you bring this up—
Reg: Now the question—?
Milt: One of the things mentioned with the auditors at some of the events that I would attend is the number of records that are being exfiltrated off systems all day long, and one of the strategies of organizations that are an advanced persistent threat is to try to get the best credentials possible. So if there’s any chance they can gather credentials on somebody that’s an authorized person to do whatever they want on the system, now they have the keys to get whatever they need whenever they want.
Milt: And I think there was just one with Uber, as an example. They were able to do that on an AWS system—it happened a few weeks ago. So I think it’s a matter of how do we really manage the system as much as we can to find out how you can monitor it to see who is looking at what, when they’re looking at it, and what information could we provide for the security of that architecture.
John: And therein lies the really big problem. You have the day-to-day operations of privileged access users. They’re going about life and they’re doing their job the way they’re supposed to. They’re going to have thousands and thousands of events through a year that are actually authorized and controlled and approved events. And here you have somebody, an APT, some sort of threat that comes in, and hijacks that credential, whatever it is. And there are many ways to hijack your credential, but they get authority to do something. Now you have a person who may do something that is unauthorized, not approved, and did not go through change management or something like that, and we can trigger on those kinds of events and then go okay, was this the norm? Was it not the norm? Those are the kinds of things you want to look at because just going through SMF, which can be millions upon millions and millions of records—
John: And hope that an eyeball got it? No, you want an automated system that monitors the effective use of the ESM at the time when it happens, and hopefully intermixes with some sort of control system so that you can say, this was an approved change or not an approved change.
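The triggering John describes, checking a privileged action against what change management approved, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event and change-record shapes, the `triage` function, and the user IDs are all hypothetical, and this is not how any vendor product or SMF record layout actually works.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SecurityEvent:
    user: str            # user ID that issued the command (hypothetical shape)
    command: str         # command verb, e.g. "ALTUSER" or "PERMIT"
    timestamp: datetime


@dataclass(frozen=True)
class ApprovedChange:
    user: str
    command: str
    window_start: datetime
    window_end: datetime


def is_approved(event, changes):
    """True if an approved change covers this user, command, and time."""
    return any(
        c.user == event.user
        and c.command == event.command
        and c.window_start <= event.timestamp <= c.window_end
        for c in changes
    )


def triage(events, privileged_users, changes):
    """Keep only privileged-user events with no matching approved change."""
    return [
        e for e in events
        if e.user in privileged_users and not is_approved(e, changes)
    ]


if __name__ == "__main__":
    changes = [ApprovedChange("SECADM1", "ALTUSER",
                              datetime(2022, 10, 1, 20, 0),
                              datetime(2022, 10, 1, 23, 0))]
    events = [
        SecurityEvent("SECADM1", "ALTUSER", datetime(2022, 10, 1, 21, 30)),
        SecurityEvent("SECADM1", "PERMIT", datetime(2022, 10, 2, 3, 15)),
        SecurityEvent("TELLER7", "LISTUSER", datetime(2022, 10, 2, 9, 0)),
    ]
    for e in triage(events, {"SECADM1"}, changes):
        print(e.user, e.command, e.timestamp.isoformat())
```

Only the unapproved PERMIT by the privileged user is left for a human to review; the approved change and the ordinary user's activity are filtered out.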
Milt: Yup and we look at it—I see that we typically have access by user accounts. There are two different types, right? One is human users, and then the other would be automated nonhuman users.
John: And those are the ones that can be compromised the easiest if you think about it, right? We think about the nonhuman. Human users typically are nefarious or they make a mistake—one of the two, right?—but nonhuman users are the ones that I get a kick out of. I was just on a system the other day in Africa that surprised me. They had a nonhuman user that all the humans knew the identity and password. I’m like, that’s not a good nonhuman [laughs]!
Reg: No. No.
John: Well we were talking to a client. They said that they were users of the robot credentials—
Milt: Uh-huh. Yeah, I remember that, right.
John: We were talking to a client and they said to us we need some help with this. Their people had taken the credentials of the robot and were logging into the mainframe system and just doing that because it was easier and they could bypass MFA and all the other kind of stuff.
Milt: Yeah, so be careful that we don’t think that our automated nonhuman users are not being used by human users [laughs].
Reg: But this is such a big issue. Yeah, I mean somebody has access to a script somewhere that’s got a clear text user ID and password and is talking to the mainframe in order to do something and has probably got all kinds of access just so they don’t have to keep adding access to it, don’t they?
John: Typically those automated users that we see out there—and I like the way you put it. There are clear text passwords being stored in some other facility. They might be in a Notepad, they might be in a configuration file, and somebody knows those because you gave them out for them to put them in those files. They reuse them and those are typically extremely elevated users because of ease of use.
Milt: Right. Those are the ideal ones if you were going to jump onto a system and do something you shouldn’t do. Those would be good ones to get.
John: Sure. Remember what Disney had years ago? Somebody had a Notepad with thousands of user IDs in it, and they just went down the list. That’s what happens.
Reg: Now this gets back to not merely educating the individual auditor or security professional or just professional, but the entire culture. And I understand you guys have had encounters with organizations that had really tight cultures as good examples, and they’ve had you sit down and do some pretty exhaustive stuff. But I’m wondering how does that map to the culture of mainframe shops that have perhaps gotten complacent just because they didn’t have the bandwidth to keep up with the level of scrupulousness they needed?
John: Wow. That’s a mouthful there to be honest with you, Reg. It really is, because you’ve got a couple of things in there that I hope we don’t overlook, and that is the bandwidth of the humans that are doing this job. You’re right. Over time they’ve been inundated and overrun with the amount of responsibilities that they have, and that has exacerbated this without a doubt. But you also have regular business practices, right? You have administrative accounts, you have emergency accounts, you have application service accounts, you have developers that are allowed to run DevOps in different places, and stuff like that. And if you think about all those things, we’ve kind of lost control at some point. And the right way to get control is by the monitoring, because all those things exist for good business reasons. They have to exist, but we should be monitoring those solutions, not just granting access. So you’ve got the access management, one part of PAM, but on the back end, if you’re monitoring the utilization of those specific accounts and the things they do and you can trigger off of an event that they did, now you can do a deep dive with fewer resources over time because you only have to look at the things that matter to you critically. It’s all about the manpower at that point. You don’t have enough time to review every event that ever happened, but if you have the ability to focus on this event, this session—this is the thing I want to go down and look at. That’s where real privileged access monitoring comes into play.
Milt: You bring that up, John. We met with a client recently, and it was a bank and they have branches, and inside the branches they would have two or three tellers. The tellers all had the same password, I think it was, and the branch manager had the—I’m just going to call it the teller password for the branch. So if you came into that branch, you got the password for the branch. And there wasn’t any real accountability because they had basically elevated rights to do whatever they needed to do to do their business every single day—
John: Oh, convenience sometimes outruns security. There’s no doubt about that.
Reg: We’ve all seen that.
Milt: Then the auditors came to them and said you need to put some controls in place so we know, for the four tellers, exactly what these people are doing, when they’re doing it, and how they’re doing it. So we’re working to help them solve that particular problem. But I almost fell over when I heard about it, because you wouldn’t think that in today’s marketplace you wouldn’t have the controls in place. But they have software. You talk about the mainframe and the systems—they have legacy systems that have been there for 20 years. They’ve been running their business fine, never had a problem, but now they may have tripped over one and they want to start solving that kind of an issue.
John: Well I think they trip over things when things happen. And Reg, I think that goes back to what you said. The amount of people doing the amount of things they do, right? So typically if you think about the world today, we’re reactive to when things went bad because of the overload that you talked about, right? All of the things you’re talking about from an access point of view, they’ve existed forever, okay? The only difference is that 30 years ago in the mainframe, specifically z/OS—and back then it was IBM systems, whatever they were called at the time: 360, 370, not to age myself or anything—but when you look at those systems back then, you probably had a staff of 30 or 40 people doing the work that you might have three or four doing today.
John: And we have to have tools such as privileged access monitoring to do that automated work of all those people, because the work didn’t go away. The events didn’t go away. As a matter of fact, they’ve gotten worse. More people now know how to hack a z/OS system. There was a lot of obscurity on this platform for many decades, but now, if you go out there and search for “hack z/OS” on YouTube, you’ll find 20 examples of actual critical ways to infiltrate this system and exfiltrate information off of those systems.
Milt: And they can carry their z around with them.
John: That’s true too, yeah. You can carry a copy of a z now on a laptop if you want to, and you can learn, and that’s another one. I mean you go out there and you look and there are open-source systems out there that let you emulate this platform and practice, right? You know, just like if you go out to any kind of hacking course that’s online today, there are going to be platforms that you can practice on, including z/OS, which gives you the ability to go out and try these things. If you can exfiltrate a user ID and information, now you can use that and now you can glean more data.
Milt: So you can come in as the robot, get onto the system and have a happy time.
John: Absolutely, and if there’s less and less people talking about those, then you’re going to have less and less people doing it. And if you don’t have a good monitoring solution—hopefully you do and maybe you have some solution out there that’s reporting to some sort of SOC or NOC into maybe a security event manager or something like that. You’re going to have to have those kind of things because you can’t maintain any kind of vigilance if you don’t have some tool both collecting the information, reporting the information, and a human being monitoring information.
Reg: That’s so important because intelligence—not just artificial intelligence, but human intelligence itself—is so essential to this practice. Go ahead.
John: Well there are some good things that AI can help you with. AI is a great invention for the ability to monitor something and take an immediate action faster than a human, absolutely, but at the end of the day a human being has to make the determination. AI can maybe cut you off, can turn off the port, can shut off the terminal, can do something based upon an action if somebody told it to, but then there’s the after-action part: why did that person do it? What was the collective reason for it? Was it a person inside? Was it by accident? Those are all things that take human characteristics to go evaluate the information and see, you know, why was it there? And so at the end of the day some human has to be responsible. AI can trip the trigger and go take an action to prevent further loss, but the after-action part typically still has to be a human at this point.
Reg: So I mean I know you guys have some outstanding tools for doing this, and I want you to tell us about them at some point. But before you do, I want to ask you a little bit about what’s the process of properly configuring a security or auditing or even a technical person to be one of these people whose experience and intelligence interacts effectively with such tools?
John: Well that’s a really good question. Actually there are some good places to do that. As a matter of fact, I’m going to put a plug in for our conference just to let you know [laughs], but that’s actually a good example. You have to be diligent in this. You have to attend and learn constantly, and Vanguard is one of the security providers that actually teaches these different things—not only a z/OS level or an RACF, which we’ve been known for for close to 40 years now. We’re RACF experts and we teach people how to think about how RACF should process, think about what the rules are, think about what the exposures are, and then give them real world examples. At the conference, we brought hackers in there and showed them how this is done. We’ve done real world examples on the fly. We’ve had open contests with people at these conferences. Hey, here’s a system. Hack it. Let’s show everybody how to do it. So you have to stay vigilant to do that and you have to attend ISACA conferences if you’re an auditor.
John: You have to understand the auditing principles as they change. You’ve got to go to security conferences with IBM, with Vanguard—with CA if you’re in that industry. But whatever your industry principles are, those are the folks who’ve got to learn every day. If you’re not learning every day in this industry, you’re not staying current in this industry. It’s just that simple.
Milt: You know you bring that up. One of the things that I’m finding because of my youth [laughs] is that recently I was working with one of our clients, and they just got a new CSO in who had never been involved with the mainframe at all.
Reg: Oh boy.
Milt: Well-educated, lots of background, smart guy, and he was concerned about privileged access management and monitoring. He looked at some tools just for privileged access, and they were getting ready to place an order with us and he called us and he says you know, I’m a little concerned why I have to get the stuff for the z/OS machine when I can get all this other stuff for the open systems. They’re going to take care of my privileged access. It took us a while. I think we were on the phone with him for like 2 1/2 hours and we explained how the z/OS platform works. And one of the things that he really, really liked was the ability to get real-time reporting through our active alerts—other companies have things—and get them out to their SIEM. They happen to be a Splunk user and we could get that information out to him, get it in easy-to-use screens that were—
John: Dashboards, right?
Milt: Dashboards that have the graphs and all the other kind of stuff that executives like to look at. But it was just an interesting—because he was not a mainframe person, he had a real hard time grasping the power of the z/OS platform.
John: You just said SIEM, and we’ll use Splunk as an example because you brought that one up. Splunk can consume, like we talked about earlier, the SMF records. Those are millions upon millions of records every day.
John: They’re not specific to a security event, to access management, to elevated privileges. And the difference is this: do you have this big bucket of all these events and expect somebody to build filters and tools and dashboards and everything to relate to them, or do you buy an independent tool that narrows down the focus and says this is an event, this is a security event, this is a privileged access security event, and it builds upon those and then triggers a specific dashboard? And that’s where we come in at the end of the day.
John: Splunk is an excellent tool, but you’ve got to have the add-on feature that says I’m going to narrow it down to my z/OS platform for my banking applications—you know phase or change by an authorized user in this area of the data that I care about, now trigger an event. So that’s how you get down to where the intelligence in the machine/in the application can trigger a human to go do something.
Reg: Hmm. Now you’d assume there’s also some activities that can be triggered automatically perhaps using intelligence, but that can happen on platform in real time. What role do you play in that?
John: Well there’s kind of two ways to do that, if you think about it. You can be preventive, proactive if you will—
John: Or you could be reactive. And we have both types of programs in there. We have a thing called Enforcer for instance that will actually look for certain security events and changes in the database, and if those changes happen, roll those changes back. That’s kind of a reactive mechanism, right? See something happen—oops that’s not supposed to happen, let’s change it back. We also have a tool that Milt likes to tout a lot recently because he’s had a lot more success with it, which is proactive—that’s our policy manager. So one of the things you’ve got to think about when you talk about privileged access management is the ability to actually put in proactive tools that say even though your elevated privilege is allowed, I only want you to do a subset of that. So in other words in RACF—we’ll use that one as an example—you have a thing called system special. Well system special gives you the ability to do everything on the platform. It’s like the oh my God, I can do everything in the world. We have a tool that will come in and proactively say oh, wait a minute—no. I know you’re allowed to do everything, but I’m going to take away everything that’s not within your job. If you’re not allowed to do data sets, I can remove the ability to do data sets and let you just manage users. If you’re not allowed to do general resources, I’m going to prevent those. If there are certain general resource or data set profiles that are highly privileged and I don’t want junior system programmers or security guys to manage them, only senior, I can put in proactive controls with a thing called policy manager that prevents you from executing those commands. Even though you have that large-scale, system-wide universal access from a privileged access point of view, I can control that and proactively prevent that.
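The proactive idea, letting a highly privileged user issue only the subset of commands that fits the job, can be illustrated with a small Python sketch. The role names and the role-to-command mapping are hypothetical; the command verbs are standard RACF commands, but nothing here reflects how any actual policy product is implemented.

```python
# Hypothetical mapping from job role to the RACF command verbs it may issue.
ROLE_ALLOWED_VERBS = {
    "user_admin": {"ADDUSER", "ALTUSER", "DELUSER", "LISTUSER"},
    "dataset_admin": {"ADDSD", "ALTDSD", "DELDSD", "LISTDSD", "PERMIT"},
}


def command_allowed(role: str, command_text: str) -> bool:
    """Return True only if the command's verb is in the role's allowed set."""
    words = command_text.split()
    if not words:
        return False  # an empty command is never allowed
    return words[0].upper() in ROLE_ALLOWED_VERBS.get(role, set())


if __name__ == "__main__":
    # A user administrator may resume a user ID ...
    print(command_allowed("user_admin", "ALTUSER JSMITH RESUME"))
    # ... but not change data set access, whatever system-wide authority exists.
    print(command_allowed("user_admin",
                          "PERMIT 'PAYROLL.**' ID(JSMITH) ACCESS(READ)"))
```

The check is deliberately deny-by-default: an unknown role or an unlisted verb is refused, which matches the "take away everything that's not within your job" framing.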
Reg: I’m going to guess a big part of that process is sort of the discovery process of identifying when you’ve got somebody who has been there for 40 years and has collected access throughout their career to find out which accesses are actually still needed for their current role, and then get them to let go of the rest.
John: Well that actually leads you into kind of a different product or project that we have called Clean Up—
John: Because what you just described happens in every—I don’t care who you are and even in my role at the company, I probably have not lost the access I need as I’ve elevated myself through the company, right? I still have access to everything that I need and that’s what happens over 40 years.
John: I had this privilege or that privilege, but do I need or use them? And what you have to do is you’ve got to combine multiple things. You have to monitor over time to see if I actually use those, and then remove those that I have not used. We actually recommend specifically—look at things for 400 days. So you’ve gone through your entire business life cycle for a year—your end of year, your close out, your quarters, and all that—and then you can produce reports on the access that has actually happened, okay? And we compare that to the access that’s available and say, wait a minute. You’ve got all this access you’ve accumulated but you’ve only used 60% of it. What do we do about this other 40%? Should we remove it? Then we give you the ability to use that on a z/OS platform and say okay, if I make these changes, what’s going to happen to your access? So you know the results before that happens. That’s called Offline, actually. It’s a simulation product we have.
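The comparison John describes, access that is available versus access that was actually used over roughly 400 days, comes down to a set difference. A minimal Python sketch follows, assuming a hypothetical usage log of (user, resource, date) entries; the function name, data shapes, and sample resources are illustrative and not taken from any product.

```python
from datetime import date, timedelta


def unused_access(granted, usage_log, today, window_days=400):
    """Return granted (user, resource) pairs with no recorded use in the window."""
    cutoff = today - timedelta(days=window_days)
    used = {
        (user, resource)
        for user, resource, used_on in usage_log
        if used_on >= cutoff
    }
    return sorted(granted - used)


if __name__ == "__main__":
    granted = {
        ("JSMITH", "PAYROLL.MASTER"),
        ("JSMITH", "HR.REPORTS"),
        ("JSMITH", "LEDGER.ARCHIVE"),
    }
    usage_log = [
        ("JSMITH", "PAYROLL.MASTER", date(2022, 9, 30)),
        ("JSMITH", "LEDGER.ARCHIVE", date(2020, 1, 15)),  # outside the window
    ]
    for user, resource in unused_access(granted, usage_log, date(2022, 10, 26)):
        print(f"candidate for removal: {user} -> {resource}")
```

The result is a list of candidates for review, not an automatic removal list; as the discussion goes on to note, any removal should be reversible.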
Milt: Now, John—
Reg: Can you archive their access? Sorry, go ahead.
John: Actually yes. We actually give you the commands to remove the access, and we give you the commands to replace the access back that you took away. That’s a funny story because we did that because guess what? I can guarantee Friday night you’re going to make that move and Saturday morning, the CEO is going to be on the phone going oh my God, you changed this and we have to have it. So yes, everything we do has the ability to put it back in and back out.
Milt: So John, you mentioned—I’m sorry. I didn’t mean to interrupt you there—but we had a client that was working with us because they had access from people that were no longer there, and that particular access was in what’s called the surrogate class in RACF.
John: Oh yeah.
Milt: It was a huge audit finding for them, and they spent years getting that thing cleaned up. They didn’t do it in one day. They collected for a year and then they started the next year, and the year after that. They had some—I don’t know the number. It was huge, a couple hundred thousand over some period of time, because they hadn’t done any cleanup in 30-some-odd years. And that’s another example where somebody who has a nefarious attitude and wants to do something wrong can grab surrogate class access from somebody that’s no longer there, put a job on the system, do certain things, and then get rid of the fingerprints because you’re going to keep rolling that thing forward.
John: For those who don’t know, surrogate access is actually indirect access. In other words, I don’t need Milt’s identity. I can actually submit a job as Milt if I’m authorized to use Milt’s identity, so it’s called indirect surrogate access.
Reg: It’s UNIX SU.
John: Similar, yes. You can do an SU without a password at the end of the day on UNIX. This is identically the same thing.
John: If I’m authorized, or if I’m a sudo user as one, or—there are many different ways to do that on UNIX and z/OS. And if you have indirect access, are you monitoring the use of that indirect access? In the surrogate class it can go level after level after level, and that becomes a very hard cleanup, so think about it from a UNIX aspect, right? I can sudo to John, I can sudo to Bob, and I can just continue that process, jumping from user ID to user ID to user ID as long as I know that. The big difference in the RACF world is that there is no password required.
Milt: Right, so that’s a big thing you need to be—we talk about privileged users all the time. Effectively with that capability, you’ve created a privileged user.
John: And you’ve created a privileged user chain, and you’ve got to look at every link in that chain.
Reg: Wow, because if you have access as a surrogate to somebody with low authority but other surrogates have access to people with higher authority, that would be a real problem in that chain.
John: Yeah. If you’re a low-level user who just logs in and you can submit as a high-level user, you’ve got the keys to the kingdom.
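The "look at every link in that chain" point amounts to a graph search. A minimal Python sketch follows, assuming the surrogate permissions have already been extracted into a simple mapping of who may submit as whom. The user IDs, the `privileged` set, and both function names are made up for illustration; a real review would start from the security database itself.

```python
from collections import deque

# Hypothetical surrogate permissions: submitter -> user IDs it may submit as.
surrogates = {
    "CLERK1": ["BATCH1"],
    "BATCH1": ["OPSMGR"],
    "OPSMGR": ["SYSADM"],
    "AUDIT1": [],
}

# Hypothetical set of user IDs treated as high authority.
privileged = {"SYSADM"}


def reachable_identities(start, surrogates):
    """Breadth-first walk of the chain; maps each reachable ID to its path."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in surrogates.get(current, []):
            if target not in paths:  # also guards against loops in the chain
                paths[target] = paths[current] + [target]
                queue.append(target)
    del paths[start]
    return paths


def privileged_chains(surrogates, privileged):
    """List every chain that leads a non-privileged ID to a privileged one."""
    findings = []
    for user in surrogates:
        if user in privileged:
            continue
        for target, path in reachable_identities(user, surrogates).items():
            if target in privileged:
                findings.append(path)
    return sorted(findings)


if __name__ == "__main__":
    for chain in privileged_chains(surrogates, privileged):
        print(" -> ".join(chain))
```

In this sample, CLERK1 has no direct permission to submit as SYSADM, yet the walk finds a three-hop path there, which is exactly the low-level-user problem Reg raises.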
Reg: Yeah, and that’s really important. You know one of the things that wasn’t that much the case except for the advanced user under the mainframe for a long time was the idea that there were sort of little hidden doors that you could go through in order to substantially increase your access. And although IBM’s statement of integrity says you don’t have to have it that way, people sort of allowed it to be that way because they were too busy. I’m going to guess that there are some other things like that that you’ve encountered as well. You are referring, for example, to Db2 data. What sort of things—because Db2 is just an absolutely complex environment that serves up a really simple model. What sort of things have you encountered under Db2 that have that similar indirection that you have to deal with?
John: Well Db2 is even more complex, like you said, depending upon whether you’ve got secondary authorizations, whether you have group authentication authorizations. So again you’re talking about multiple levels of security integrity, so IBM’s statement—and I like the way you brought that up. The integrity statement says that—and they’re very good at it. They say the system has integrity if you configure it properly, and it does. If you configure it properly, I can guarantee you that this platform or any platform done securely is good, but it’s that misdirection that happens. It’s that you forgot this step. You added that surrogate. You added a group connect to a secondary authorization in a Db2 table that you didn’t realize gave them access to a system table, which gave them access to be able to grant themselves authority throughout the system. So again those are chains, because you chain from an RACF group or a secondary auth group in ACF2 into a table that’s within the system authorization in Db2 to a system table that gives you the ability to do a grant authority in there. So again, those are links in a chain, but it takes all of them, and specifically takes all the knowledge to use those.
Milt: Db2—we get a fair number of requests per year to work with our clients or prospects to take the Db2 and put it under the control of the external security manager, RACF, and it’s not a trivial job. It’s not a trivial job.
John: And when you talk about Db2, that’s one of the caveats you have to be careful with. Because Db2 has internal security that he is talking about. It has external security, and then it has a combination of both security—
Milt: Yeah, yeah.
John: So and it has fall through security. So again the knowledge of what you’re laying down and how it can be exploited becomes very important as to what you’re doing—
Reg: That cascade event.
John: Oh yeah. Most security cascades, and people don’t realize that. When you’re going to do—especially like off-platform migration, and you talked about LDAP, which is a good one, right? You link that data out externally. What can I do to exploit that? Now I don’t need to know the data of the platform I’m on, but the data of the platform I’m going to go to. It doesn’t matter. If I can exploit from an Active Directory system that has that data connected to it, it doesn’t matter. It definitely cascades at that point.
Reg: Now of course on the one hand we’re talking about the users here and their relationship with each other and chaining each other, but then there’s the data itself, and I gather that just being able to properly characterize and identify and monitor critical data is a whole discipline that you guys are engaged with.
John: Well critical data—every company is different. Let’s be serious, right? All companies are—and you call it critical data, and it’s literally what do you define as critical to your business? If your business is banking, your critical data is probably the money. So what’s protecting that money? Well, it's that money nowadays is all electronic at the end of the day. What data sets are in there? What information for account numbers is in there? So now you have to—again you’re cascading different things, right? So I know that the account numbers are in that database. That database consists of a bunch of files. Those files are being protected by a security manager. They have internal security for Db2 that lets me do selects. They have external security that lets me copy those data sets. All of those things have to be monitored at the same time. That means you have to have a very complex system that not only monitors the use of the data, the type of data, the tables that are in it, the actual privileged user—what do they do day to day and what does the system allow to be exfiltrated off and on that system? So it’s a very complex thing. Now yes, we do have many tools that look at the ability for the person to make those changes, the accesses that they requested, the data sets they utilize, the general resources that they use to grant themselves access to that data. So there’s many different ways to do that, but all of those have to be monitored together to give you that holistic view of what’s happening.
Reg: I’m going to guess SMF by itself is only one dimension of that.
John: Um, really actually I think SMF is not one-dimensional but requires—like we talked about the integrity statement, right? It requires a human to actually set it up to do something. So SMF—most systems in the world out there are not going to do what’s called record access of success. It’s just too much information. So by default, it’s only set for violations or bad things that happen. Okay, so you have to take some sort of action that—because remember, now we’re talking about privileged access, right? I have access to that data, I want to monitor that access. So I’m already in that successful condition because I’m an authorized user to that information. So what I really need to know is: is an authorized user using that data in a way that it shouldn’t be used, and that’s the real question, right? The question is do you as an authorized user performing your job—did we record the information to make sure that you did your job when you’re supposed to do your job, at the time you’re authorized to do your job. Those are all key things that you’ve got to look at when you talk about privileged access monitoring. You have to monitor all of them and then throw away the things that you don’t need because they were good and then only event trigger and deep dive into the things that matter to you, but you’ve got to collect it all before you can make that decision.
Milt: John, I was at a client—one of our clients—and ran into the CSO. As you know we attend the CSO events. We haven’t for the last couple of years because of the Covid thing, but prior to that we attended them. And when I was on site, he said—and they’re in the insurance sector—he said I’m having a problem with sometimes every once in a while, having people logging into the system at odd hours. We need to make sure we know exactly what they’re doing during those odd hours. They had a whole team assembled because they were very concerned about the PII data and other information that people would grab and get into the wild, or they were selling it. It could have been about a movie star or somebody famous, whatever it happens to be, they want to get the information. But these people are logging on, this one particular individual was logging on and they couldn’t find out exactly what they were looking at. Their normal work hours were not 2:00 and 3:00 in the morning, so those are the kind of triggers: what do they look at? How do they get to it? What files are they into?
John: Well that’s where we come into play on some of that. So today we exist in the monitor-every-event world, and then eventually the trigger comes in, like you said.
John: Okay and you have that information because they accessed that. But at some point—and this is in the future, this is not today. At some point in the future—and we even see this—we’re going to combine an AI in a security system into an integral part to do that monitoring at the same time.
John: So it could record over time the ability for you to know what you just said, and that’s the ability for you to look at Milt as a human being, as an activity on a system and not just on z/OS but across the universe. Milt logged into the VPN. Milt logged into z/OS. Milt logged into the ATM. Milt logged—and you have lots of large data that do that. You want to combine an AI infrastructure with that infrastructure of the data that’s traveling across the universe, and then say okay, wait a minute. Milt normally only comes in from 8:00 to 5:00. Why is he coming in at 2:00 in the morning? That’s the future of real security is when you have AI and security as an autonomous system to notify the humans: hey Milt wasn’t supposed to be here tonight. Now what do you want me to do?
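The "Milt normally only comes in from 8:00 to 5:00" check can be approximated today with a much simpler rule than the AI system John is describing. The Python sketch below is a stand-in, not AI: it builds a per-user baseline of logon hours from hypothetical cross-platform events and flags a logon that falls outside them. The event tuples, the function names, and the one-hour `slack` are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical logon history gathered across systems: (user, system, time).
history = [
    ("MILT", "VPN", datetime(2022, 10, 3, 8, 5)),
    ("MILT", "z/OS", datetime(2022, 10, 3, 9, 12)),
    ("MILT", "z/OS", datetime(2022, 10, 4, 13, 40)),
    ("MILT", "VPN", datetime(2022, 10, 5, 16, 55)),
]


def build_baseline(events):
    """Collect the hours of day at which each user has been seen."""
    hours = defaultdict(set)
    for user, _system, timestamp in events:
        hours[user].add(timestamp.hour)
    return hours


def is_unusual(baseline, user, timestamp, slack=1):
    """True if the logon hour is more than `slack` hours from any seen hour."""
    seen = baseline.get(user)
    if not seen:
        return True  # no history at all is itself worth a look
    hour = timestamp.hour

    def clock_distance(a, b):
        # Distance around a 24-hour clock, so 23:00 and 00:00 are 1 apart.
        gap = abs(a - b)
        return min(gap, 24 - gap)

    return all(clock_distance(hour, h) > slack for h in seen)


if __name__ == "__main__":
    baseline = build_baseline(history)
    logon = datetime(2022, 10, 6, 2, 0)
    if is_unusual(baseline, "MILT", logon):
        print("MILT was not expected at", logon.strftime("%H:%M"))
```

A rule like this only raises the question; deciding what to do about the 2:00 a.m. logon is still left to a human, which is the hand-off John describes.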
Reg: Well since we’re talking about the future—and I know it’s important to distinguish between what you’ve already had, which is a really impressive tool set, and where you’re planning to go in the short-term, and what your vision is for the long-term. Well maybe you can kind of do that—just kind of you know, paint a picture. What can we do today that’s going to make this difference you’re talking about, and then how do you see the future? What are the challenges? I know security is not a journey with an end to it, but what are things over the next, say, decade or more that are going to be needed to respond to, and the kind of technologies and other responses that are going to be part of that building on what you have already?
John: You crack me up with that statement, because you said security is a—how did you put it? A road well-traveled.
Reg: A journey, not a destination, I guess you could say.
John: A journey, not a destination. Because I can recall about 40, almost 45 years ago, me and my father having a conversation about me getting into this field and him saying this was a dying field. I’m like nah, this is where I’m going to make a career out of, and it was in crypto specifically, okay? And it was funny because he was a cryptographer years ago—but you crack me up with that, because you’re right. This is a journey, absolutely a journey and we currently as a company, we have many different product lines that deal specifically with security, whether you’re talking about day-to-day operational efficiencies, whether you’re talking about privileged access management. Those are things that allow you to record where you are today, which includes what access do you have, what privileges do you have, what things you’re allowed to do. Who is the business owner? Can we report back to those? Those are operational things and we have many different things. You talk about the things, the after actions. We have also things like SMF reporting. We do total SMF reporting where you can have a human sit down and parse through many, many different types of things: who did what, when did they do it, what were the violations, how were those things. But then you also have automated responses like Enforcer and active alerts—
John: Which gives you the ability to trigger in on finite details. But then we’ve also migrated into—Milt brought up the dashboards, which is really the way the world has gone too. Which is, I need an automated system to give me that red/green/yellow dashboard that tells me these events are happening, and from an executive point of view. Okay I don’t care about the green ones this week. I want to see what’s happening in these other ones and I can escalate those, but then we also go into other realms of the future, which is the proactive and preventative parts, right? Not only can we monitor them but hopefully we will get to the point where we can monitor these actions, prevent these actions, and then maybe actually lock these systems down through some sort of intelligence to make them automatically adapt to these kind of security requirements. That’s where I see the future going.
Milt: I’ll make a comment here on what you were talking about under the second item in efficiency. We see a big push with our clients, and we have three or four active programs we’re working on right now to help large clients improve their operational efficiency in the reporting of—collecting the data and reporting it in a consolidated format like you were talking about—in easy to read reports, speeding up the process. So we use our aggregation delivery capability, helping them put fixes and maintenance and all the other kind of stuff on their systems in a very rapid way, helping them understand what date things are on, make sure all the LPARs are operating like they should. We’re being forced into this with our clients by request. This is them coming to us to say we really need to take a peek at all of these things because we had five or six people working on this. We just got notice that three are going to retire, and now we’re down to two. These three did these jobs. Can you do this for us? I think the big push in the z/OS market, where Vanguard is going to be, is to help our customers be more operationally efficient and give them the kind of output and information they need to run their business from a business perspective. Just like the simple request we had with the bank, where before, four people in the bank would have the same password. Now they want us to help where each one of them does it, and we have some tools that can help elevate that process while they’re in there. I think that’s the kind of marrying that we’re seeing when we’re taking 30 some odd years of code, a vault of information, really smart developers. We develop all of our own stuff. How can we get in a room, gather all this stuff up and help a customer achieve operational efficiency, better security, good reporting, answer to the CSO, pass the audits? I think that’s really how we’re being pressed as a company to help them run their business better.
John: I think you bring up the CSOs, which is a really good point. CSOs 30 years ago were a different breed than they are today. They grew up in—their mainframe was there, if you will. Prior to about ’92-’93, the mainframe was the mainstay of all data.
John: It really was, but over that time since then, CSOs have grown up in a different environment: UNIX, Windows, Linux, very heterogeneous systems that are out there—
Reg: Cloud stuff.
John: Cloud stuff, all kinds of things. So they may not be exposed to this, but they’re exposed to the same principle—and you brought up VAD, which is a good way to talk about a maintenance resilience that doesn’t exist on a mainframe. In an open system, in a cloud environment, you have the ability to do nightly maintenance. You have automated push technologies. z/OS was done with SMP/E as a type of manual interaction, okay? There are some automated tools that would go on there, but you’re going to find that in the future you’re going to see tools like VAD that help operationally push out a maintenance patch overnight without the humans, just like a Windows or a cloud upgrade. So we’re going to be producing more and more things into that realm for operational efficiencies—
Reg: Yeah. Yup.
John: And put them in a way that a newer CSO who’s been brought up in the cloud, in the UNIX, in the Windows. We can say look, that’s just an SIEM for the mainframe at the end of the day. It’s an automated system to push out that maintenance for you, and here’s how you do it.
Milt: And make sure it’s applied.
John: And make sure it’s applied.
Reg: That’s so important. You know we forget that a denial of service attack is a direct contravention of security, and therefore security is about availability. And you don’t have availability if you don’t have the bandwidth to get things done.
John: That’s true and the bandwidth not to do things has actually caused many a security breach. And if you dig down into them, it’s because the patch wasn’t done. Well, why wasn’t the patch done? Well, because they were doing other things. They didn’t have time. Well, why didn’t they have time? Well we went from ten—three people retired. I got seven. Three more retired, and I’ve got four. I haven’t back-filled those. I can’t expect three guys to do the work of ten, even on the mainframe.
Milt: Yup. We have a client that we’re working with—it’s an international government client. They want to go ahead and move forward to help put better security on their system, but come to find out they’re on—what release are they on?
John: Oh, 2.2, today. Yeah.
Milt: 2.2 today, and they’re getting whacked in the head by the auditors: get this system up and running. Well come to find out that the people that were working on that system were government employees that have left.
John: And like he said, the system level, the operating system level that they were talking about had gone out of support four years ago, and yet it was still running because operationally, it didn’t die.
Milt: Yup [laughs].
Reg: That is a big issue. Are there any other really big issues you want to identify before we wind up today?
John: No. I appreciate your time and talking, but like you said, I think at the end of the day, Milt, our job is to talk about security to many, many different people. We talk on a platform that’s very large, and you’ve alluded to the complexity of those. The biggest thing that Vanguard—its founder said it from day one. Ron’s big part was that this company was about the knowledge of the system—
John: And what we bring is the knowledge of those systems, and tie tools with knowledge, so that you don’t have to have as much of that. That’s what we bring to the table.
Milt: One thing kind of nice—not nice but it’s a real quality of what we deliver. We build it here, we test it here, we run it here, and it works. That’s a nice feeling when you’re pulling into the parking lot, to know that we’re helping our clients solve real business and security problems. We get many requests to do special things for them to help with their business, and we’re able—because of our size and our agility—we’re able to turn on a dime and help them solve their problems.
John: And that’s what security really is. You solve one problem at a time. Hopefully, you get them all solved [laughs].
Reg: Awesome. Well, the conversation has gone long and short at the same time. It feels like we just started. This has been absolutely fascinating. Milt and John, it’s been a real pleasure. Thank you so much.
John: Thank you.
Milt: Thank you very much.
Reg: I’ll be back with another podcast next month, but in the meantime check out the other content on TechChannel. You can also subscribe to their weekly newsletters, webinars, e-books, Solutions Directory and more on the subscription page. I’m Reg Harbeck.