John Dominic on Cloud System Benefits and Trends
John Dominic: Hey Charlie, glad to be here.
Charlie: Thanks. John, I do want to start our conversation with what I just said about cloud because it’s a bit of an elusive term in that it has so many different definitions depending on who I'm speaking with, so why don’t you start and that's a very good jumping off point. Why don't we start with that? What is cloud from where you sit as you see it?
John: Yeah, that's a good question and it is really tricky. I think in every conversation we have, cloud means what it means to a particular partner or client so, as an example, from an HA standpoint, we obviously get involved pretty heavily with disaster recovery, you know, elements of the cloud or data backup. We get involved with migrating people to cloud whether it's a hybrid cloud or a multi-cloud environment, but the reality is it can mean so many things to so many different people. Infrastructure as a service gives them a landing platform. We have customers that use it as more of a turnkey platform as a service; customers use it just for a development partition as they spin things up for a special project so it's a challenge. I mean, it's basically compute that's available as turnkey as you want, but as you say, most conversations really depend on what angle is going to be relevant to a particular project and I would say that hybrid cloud, which involves some of these other things, is probably the most predominant in the conversations that we have.
Charlie: And I think that's another point of confusion right there because I think some customers or some people think hybrid cloud that—what does that actually mean? Does it mean I'm doing some of my computing on prem and some computing in a cloud or am I just using services that are available on the cloud? What would be a proper definition of a hybrid cloud implementation?
John: Well, see that's a great point here. I think it's hard to really put down that definition. I would consider it to be—hybrid cloud would be someone that is running a premise-based operation, usually for what we'd call their production or source type system, and then they are leveraging the cloud partner in some fashion to provide a secondary landing and they'll do it for different reasons. They might want to have a system that is available at a further away point, so not on the same grid. They might want to have extra hands available in the event of a disaster. That's what we see particularly with COVID, you know, they want to make sure that if there is like a major outbreak locally, they might have other people available. So, I would see hybrid cloud as being just an extension of having production on premise and then going out the arm. Now, that comes in all kinds of flavors and directions. We have customers that use localized systems for high availability and DR, but they might replicate out to an additional remote system, which can also be considered hybrid cloud. So I think at the core of it, it has to do with the fact that part of the workload, usually production, is going to still be based within the company, but the other platforms or applications might be hosted on a remote system.
Charlie: The remote system being the cloud system.
John: The cloud system.
Charlie: Even within the systems in the cloud, then there is a sub discussion, I suppose, and that is, is it a public cloud or a private cloud? I mean, by pure definition of those two terms, I can kind of discern what that means, but why don't you just elaborate on that. What is a pure private cloud to me?
John: I think to most customers a private cloud means that they're essentially getting a dedicated partition that is being hosted by a partner on, you know, one of the cloud services, and it is purely for them, so with the security parameters nobody else can get onto that box, and they're probably paying a premium to have that luxury. Whereas with some of the public clouds, they might share information, so they might have an application that might be hosted on one of these cloud systems that multiple clients can get to or maybe they have a shared environment so there's one partition, but the partner is running multiple customer workloads on there. I have seen that plenty of times as well and usually that fits into the mold of maybe like more of an application provider to say, “OK, well these customers are all within a particular application bundle.” Depending on what their security and insurance requirements are, they might want to decide whether they need the hardened security and dedication and privacy of a dedicated system vs. a shared system.
Charlie: John, what are you seeing out there right now as far as momentum? Do you see a bigger push right now or a bigger migration to a cloud, you know, be it hybrid or a full cloud? Do you see a bigger push towards that right now?
John: Yeah, I would say absolutely. There has been a huge ramp up in cloud over the last few years and it comes in like different configurations like I said before. It might not be—we have customers that originally would have been completely prem-based. They're now starting to get into the hybrid cloud. We have a number of our partners that provide that as a service, which is just going up and up. I think that those two things kind of go together. I think because there is such a thing as a resource challenge of expertise on a platform, that the general IBM i admin or user needs that additional support and one of the easiest places to get it is by having a service provider. Well, if the service provider is providing DR and application, if they can include the management of the hardware as well, it just makes it that much easier and one less thing that the local admin who is already time squeezed has to do, so I think part of the reason that cloud is just getting more and more relevant is not just the traditional regional separation. I think COVID has had a lot to do with it. I think the ability to have extra hands and having people work from home and having systems that are disparate is important, but I think that a huge driver is basically offloading a lot of the workload and the time that goes into buying and managing an additional system. So, it's absolutely a humongous uptick year upon year on what we would consider, like hybrid cloud or fully hosted environments.
Charlie: So those are all very interesting points. I find when you said about reasons, you know, good metrics on why I would want to consider a cloud solution in any form be it hybrid or full—you know, fully hosted whatever the case is, but what are some other metrics or some decision points that I typically would go through if I was looking at an algorithm to decide if I want to make this move or take this step. What are those decisions or what benefits can you see that somebody would want to go down this path?
John: Yeah, I think the main drivers are going to be in different camps. A lot of IBM clients, particularly those who are like the SMBs, I think are coming to the cloud for the first time as an extension for the resiliency. You know, again, COVID and all this stuff is really driving that, and I think for those customers, there are particular areas that they're looking at. #1. A lot of this is driven by risk and insurance. We're seeing a large number of customers who are really trying to leverage cloud for the purposes of meeting insurance requirements. What I mean by that is that there's definitely like a tightening up within the industry to say—to get the best premium coverage and the best premium rates, you really need to have a continuity or resiliency plan that's active and that means if something happens to your primary production, can you be running live as quickly as possible. So, while traditionally you might just look at downtime as being an operational thing, what I'm finding is there's a lot of like risk management teams within our client base and our partner base who are looking at trying to lower that premium and part of the reason they're doing that is because premiums don't cover—your insurance does not cover all of your losses. We see lots of things out there like you know from Gartner reports and Forrester and whatnot that the coverage of the losses is sometimes around maybe 40%, so the companies look at it and say, “OK, I have two victories here. Number one, I can get a better premium for better coverage, but I have to do that because I know I'm not going to cover the gap if we have an incident, and everybody has incidents.”
There's nobody that doesn't have an incident at some point, so that's kind of a big driver I think for many of, particularly, the SMB customers now in the wake of COVID in the last couple of years. They're looking at spreading out that risk trying to get the best coverage possible in case there is an outage or an unplanned problem, but the other thing I think we would see is we have a lot of vendors that are trying to extend to different territories. What I mean by that, is that, you know, there's a big population of customers that I'll use the Caribbean as our classic example. They run a big source machine on the island. They'll have an active target that they can flip between the two; however, the problem is if a storm comes in, obviously both systems go down if electrical is down. Now eight years ago, that wasn't a big problem because when the island was down, there was no business to be done, but now you're starting to see with like banks and insurance companies they're doing business on other islands as well. So a company might be in Barbados, as an example. You know, when Barbados was down and a local customer couldn't go into a retail shop, it was a problem, but now that you're seeing business coming across to Canada and the States and other Caribbean islands those businesses have to be up, so they look at cloud now as like a worst-case scenario option. We would call that like a multi-node environment and the goal would be that not only would they have their local DR and availability going on the island, but they would also have an additional system sitting some place else, the IBM cloud, an Azure cloud or a partner cloud. In the event where the local island was down, they still have the ability to run at a secondary location. Again, COVID also has a factor when we get into some of these lockdowns like we're seeing really hard in the Asia Pacific, you still need some way to be able to manage the systems and have it running remotely.
So having that regional distance fit in new ways that it didn't before, I think is really driving what people are looking at when they're looking at leveraging that cloud. It solves multiple problems.
Charlie: You know, John, you brought up a lot of information in that reply and I want to break that down to two different things because this is another area of maybe not confusion, but certainly an area where people use these two terms interchangeably and that's really HA, high availability and DR, disaster recovery. Anytime I have a conversation with anybody, these terms are just kicked around as if it's one thing but those are two distinct disciplines or two distinct events, right? HA is—well you tell me. You know, how do you view HA vs. DR and also why do you think that there is a confusion that people use them interchangeably?
John: I know. We see this all the time. I think everyone sees HA as falling under DR as a broad scope, but to me the difference is when we're talking about HA or high availability, that just means that the data is continuously available. So if you have an outage, your data is available and accessible immediately elsewhere, whereas DR would be like the traditional technology of like a tape or maybe even a vault although vault is kind of in the middle in my opinion. A tape might be, “Look, we have to restore some data.” It's going to take a few hours or maybe a few days before the system is available and that's the difference. To your point, Charlie, that kind of goes back to the insurance thing. What we're finding is that traditionally just having "DR" was enough to get you, you know, decent coverage for these types of things, but now with business interruption insurance and particularly like the threat of ransomware and all this stuff, having continuously available data and applications is critical and that is firmly in the HA camp as far as we're concerned, and that again might be an unplanned outage. DR is more, I would say, like a traditional restore; this is about having information available all the time. Even as an example when customers are migrating between systems, most of them do like a migrate while active or migrate live and that just means that, while they're bringing in the new system and getting up to speed, they want to keep availability going until that cut over point. It's not just a—10 years ago, you just would have saved the tape, put it on a new box, got everything going over a weekend. That just doesn't happen much anymore. Customers need to be available full time and the risk and insurance is usually the main driver.
Charlie: You know, I can tell just from our customers once you have another partition in this case in the cloud, that machine is capable of doing so much more than just sitting idly collecting data all day long. We are able to use that cloud system, for example, to do analysis on the data. You know we're querying that database vs. the production database. We find that it makes—it's a better use of resources by using that data because it's as current as production, yet it gives us an opportunity to repurpose that data at the same time. What else are you seeing? Any other ways that the data might be useful to me in a shop other than just being there to be on the ready to be our next production machine in the event-in the event of an accident or a problem I should say.
John: So that's a good example. We have a number of customers that essentially run BI off the target because they obviously don't want to impact the production workload and some of those tools can be real hogs, so that's exactly what they do. Because you have a real-time environment on this target system that's continuously available, not only could you run those types of tools across there, but you can do other things as well, like you could run backups off the target because the target is obviously up to date, and that shrinks the window that a system is traditionally unavailable during the night when you're doing a save, so it opens up the opportunity to have a more 24 by seven availability to the application than you would if you just had a primary source system. I was going to say running BI tools off the target system is definitely growing. I see lots of partners getting into that business as like an expansion of trying to reduce the workload because, as you say, the backup system is going to be much quieter. It makes way more sense to take up CPU on that than it does on production to run these types of analysis tools but doing it off peak.
Charlie: So, there's another interesting paradigm that we need to talk about and that is the idea of a role swap. I know this is not a one and done type situation. You typically don't go to—you know, you don't set up this whole environment and then forget about it. You need to test the vitality of this system on some kind of schedule. What are you seeing as far as that's concerned? What would you recommend as a good schedule or does it depend on the industry or you know inherent regulations? What do you see as far as that for doing role swaps, things like that?
John: You know, that's a good question. I think a lot of customers that come to us, the main reason they come is because they have tried HA in the past, but they use it more like a disaster recovery and by that I mean they don't switch environments and they haven't tested it, but the point is having something continuously available, and this is where things start to get back into that insurance realm again. I think traditionally it was ticking the box to say, “Look, we have a high availability software for this system and your application,” and from an insurance perspective they never checked to make sure it was in use. Now we're starting to see that customers have to essentially provide a live DR exercise or test or requirement to meet those regulations, so we're seeing it more and more. Now, we have lots of customers that swap like quarterly, like they continue to just swap between systems quarter upon quarter. I think most strive to do at least one system change or at least a successful test a year as a minimum, but a lot of that is driven by internal risk too. I would say one other thing that adds into that, is we talk to a lot of customers and it just drives me bonkers. They say, “Yup, we had a DR test, and it was successful. We spent a month getting ready for it.” Well the whole point of having something available in the event of a disaster is that it is available without setting it up for four months to be successful, so from our company standpoint, you need to be able to provide something that, if it is highly available, has all the bells and whistles to make sure that it can be highly available at any time, not just when you've gone through the trouble of making sure you have a successful test. I think that auditors are getting very wise to that scenario and that is driving again back to that cloud business.
I think seeing the resource it takes to make that successful sort of makes the case to have a cloud specialist or disaster recovery as a service specialist come and provide that on top.
Charlie: Right, certainly and of course the famous saying that I'm well aware of is if you don't test your backup strategy, you don't have a backup strategy.
John: Yup, yeah. I mean, it’s a lot of these things, you would say 10 years ago, you know having tape by itself right, the traditional backup was good enough and it was hardware. It wasn't a problem. You could really leverage off that and it still works. I mean, most customers still use that to do little restores and whatever. That hasn't become completely obsolete I think by any means, but the business driver is definitely more about the application availability in an event, and that is what—you know, if somebody is coming to us that is usually why. They have to meet these requirements. They want to get the good premiums. They're trying to reduce any outage window as much as possible because they've already calculated out what the potential loss is. They're just trying to keep it to the absolute minimum.
Charlie: And certainly, you said it also. Application uptime, you know it's a changed world. You know, we're on the web now. Customers—doors don't close, you know, 9 to 5. There's no set hours for business anymore, so we need to have the application up as much as possible.
John: And customers getting on social media to talk about how long the system and service has been unavailable is a brand killer, so you know it is really on the bigger companies to have that uptime because they're touching customers all around the world now and those customers have a very strong voice when a service is not available.
Charlie: John, one thing we didn't touch on and I guess maybe it's just assumed or it's baked into the equation and that is security. I know that's a big point that people have some reservations about how secure is my data in the cloud and I've heard some people argue that it's even safer in the cloud than it is on prem.
John: Well, you know what? I mean, that could be a very good argument. I have seen—yeah. I have seen environments, I'm sure we all have, where having this through a hardened IBM datacenter or something like that or a disaster recovery specialist, obviously they're taking the measures that particularly an SMB is just not going to take to secure that business, so I really don't see that as a threat. I mean what business isn't already giving customers access to you know Wi-Fi networks and so forth. To your point, Charlie, there is kind of two classes of insurance that kind of come in here again and I know I'm hitting the insurance, but it's something that just keeps coming in every conversation in the last year and a half is that there is cybersecurity insurance and there's business interruption insurance; disaster recovery and high availability actually play into both. Sometimes it might not be completely logical from the cyber security event, but again insurance is all about you providing some proof that you're going to minimize the loss as much as possible. So even if it's a ransomware where you know it's not a simple failover as a solution, still having that to reduce the potential loss there is still a big factor, whether it's a direct factor or not. So I think that's why we're seeing much more of it because you know cyber security is becoming—just growing and growing and growing, but it means so many different things to different people, just like cloud. Like we were talking about earlier, that there are just so many factors to it and availability is part of it, right. You can have an event, but the event still ties back to whether your service is available or not so the things are—while I would say they are different, they are still inextricably linked together.
Charlie: You know, if I look at the road map, John, for cloud migration, the application developers are not off the hook here. This is not done in a vacuum. I mean there needs to be some consideration to the application itself, I think, to have continuous availability for application. The way you have some batch processes for example, day end, you need to really examine those. What have you seen—what have you seen in that because some processes that were previously dedicated now you need to change them, so that they don't require a lockdown of the system. So what have you seen in that regard?
John: Yeah, I think there is definitely, like, and I'll swing over to monitoring here. You know, from like an API and connectivity standpoint, I think there are lots of new things happening here. When we look at what would traditionally be monitoring like an availability, you're really looking at a much more extended network, you know, for API, other pieces of the application, so having the ability to basically look and provide APIs out so that you can interconnect things from monitoring platforms like ServiceNow back to things like we do in high availability. There is an openness to it, but I think you could say, “Well is that a risk?” Well, I don't know. On this platform where you need the expertise and resource available immediately, you have to have some interchange and openness of ideas, so I really don't think that's a negative. I think it's actually a necessity.
Charlie: Absolutely. John I'll tell you what. We've been chatting for quite some time on this already and this, to me, is such a fascinating topic because it really could change how a company runs their IT infrastructure or it is—it could be a huge change and I think people need to consider this as an option certainly. I think this podcast and you sharing your knowledge is a very good starting point. There's a lot of good points that you brought up here. I do encourage people to you know read more about this. It's certainly worth looking at. So, I want to thank you, John, for your time here today. We've been talking holy cow almost 30 minutes and that was the fastest 30 minutes I've ever spoken or I've spent I should say. I want to thank you very much for your time, John. Again, John is the global VP of Maxava in the US headquarters. John, thank you very much. It was a pleasure chatting with you as always.
John: Anytime Charlie, thank you.
Charlie: Terrific and for everybody else who listens to this podcast, please make sure you check out TechChannel and their other great offerings. They have lots of educational podcasts and webcasts and it's really worth your time visiting them so thank you very much. This is Charlie Guarino. Everybody it's been a pleasure and I'm looking forward to speaking with you again next month. Bye now.