Russ Teubner on the Power of Automation and Modernization
Reg Harbeck: Hi. I’m Reg Harbeck and today I’m here with Russ Teubner, cofounder and CEO of Hostbridge Technology. Russ, welcome. Tell us about yourself. How did you end up in the world of mainframe?
Russ Teubner: Hi Reg. Well, first of all, thank you very much for the opportunity to visit today. Wow, how did I end up in the land of the mainframe? It feels like a long and torturous story. You know I was in university in the late 70s and actually when I was attending university, I worked for the university’s computer center and became a mainframe IBM 360/370 operator. That was my part-time job to help put my way through school. Then after I graduated, they liked me/I liked them, and that was my first job working in this data center. It was actually a golden era for learning that sort of technology because in the late 70s and the early 80s, things were exploding, right? Campuses were exploding with personal computer technology and frankly in that era in the name of academic freedom no one could tell anyone else what to buy or not buy, and so universities were awash in the need to integrate technologies. Of course the networking capabilities and architectures of that era were just so primitive compared to what we have today, but my job on behalf of that university was to make everything talk, make everything work. That’s really where I cut my teeth, not only in the world of all things mainframe, but also in the area of integration too, and with mainframe-based systems, networks, and applications. So you know actually by the early to mid 80s, I had an idea for my first software product and company. I started that, ran that for 20 years, ultimately sold that to a much larger French-based software company. And then after doing a tour of duty with them in an executive position, I decided it was time to get back to my entrepreneurial roots and I and Scott Glenn, the other Hostbridge cofounder, started Hostbridge Technology in order to pioneer some new areas of integration for mainframe applications.
Reg: Interesting. So now that would have been what? Late 80s, early 90s?
Russ: Well, when Hostbridge started or—?
Reg: Yeah, when Hostbridge started, yeah.
Russ: We started Hostbridge right at the end of the 90s, really right around 2000, so yeah.
Reg: Cool.
Russ: That was an era—
Reg: You were looking for a solution.
Russ: Yeah, and really there was kind of a personal—two things came together. One was kind of a personal vector and that was that I had served a couple of years. I had fulfilled my obligation to the acquirer of my prior company—
Reg: Sure.
Russ: But at the same time IBM was coming out with a new version of CICS, and this was Version 1 of what they were going to call CICS Transaction Server, and Transaction Server was going to have some new capabilities. There were a couple of things under the covers of Transaction Server that really, really interested me, some of the things they were doing. Particularly they were preparing to offer an API under the covers of CICS that for the first time ever would allow a programmer to programmatically interact with a screen-oriented transaction without any reference to rows and columns—in other words, without doing any screen scraping. That was really intriguing because the last product that we created at my first software company, Teubner and Associates, was, for lack of a better term, one of the world's biggest, baddest screen scrapers on planet Earth. So the late 90s was a time when we were actively exploring, in that company and with that last product, how to do integration to the mainframe using screen scraping and those sorts of technologies. Of course the answer is there's just no good way to achieve integration that is both performant and cost-effective using screen scraping as an integration technology. That was really something that we were trying to rectify with Hostbridge. We wanted to completely rethink it. We wanted to come up with a technology approach that would allow distributed components to be able—or things, whatever those things are in the outside world—
Reg: Sure.
Russ: To be able to interact with mainframe screen-oriented applications without screen scraping; in other words, without having any binding dependencies on rows and columns of where the data existed. Because our view then was, and has always been, that as soon as you build distributed applications or things out there outside the mainframe that rely upon this binding relationship to where a particular field is on the screen, you're sunk, right?
Reg: Well regression testing becomes an incredible nightmare.
Russ: Absolutely and you enter a state that I somewhat, I guess, humorously call application rigor mortis. Because as soon as you’ve got things outside the mainframe—
Reg: Right.
Russ: Relying upon these fixed assumptions about that—you know, the ZIP code or part number occurs on row 3, column 5 for 12 spaces—the ability to mature, enhance, evolve that application is frozen. It's pretty much gone.
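For illustration, here is a rough sketch of the kind of position-bound screen scraping being described; the helper function, screen dimensions, and offsets are invented for the example, not drawn from any real product.

```javascript
// Hypothetical sketch of position-based screen scraping: a distributed
// component pulls a part number out of an emulated 3270 screen image by
// hard-coded row/column coordinates ("row 3, column 5 for 12 spaces").
function getPartNumber(screenText) {
  const COLS = 80;                      // assumes a 24x80 screen image
  const row = 3, col = 5, len = 12;     // fixed assumptions about the layout
  const start = (row - 1) * COLS + (col - 1);
  return screenText.slice(start, start + len).trim();
}
// If the application ever moves or resizes that field, every caller written
// this way silently breaks -- the "application rigor mortis" problem.
```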
Reg: So basically the distributed components become the legacy anchor, not the mainframe.
Russ: Right, absolutely. I mean I love the way you said that. They become the legacy anchor. I mean there are a lot of applications running on mainframes today that yes, should be modernized, ought to be modernized, can be modernized, but the reality is one of the reasons they haven’t been modernized at all or at least as aggressively as they could have been is simply because there are so many distributed components that are interacting with those applications, using these sorts of screen scraping techniques and thus have essentially interdicted the ability.
Reg: Right.
Russ: They have kind of just nipped it in the bud, the ability to really transform that application. And so these are the class of customers that we tend to work with and to help with our technology, whether it’s our integration technology or our analytics, so yeah.
Reg: Cool. Now one of the things one may immediately think of is—I mean I have a background in COBOL and CICS programming, among other things—okay, well, if you're not talking to the screen, are you talking to the BMS map? Are you talking to the linkage section? Where in the transaction are you getting in to identify the data fields and name them in a consistent manner even when their place on the screen changes?
Russ: That’s a great question. Now we’re going to get kind of geeky and technical here, so—
Reg: I’ve always been like that.
Russ: I know but let’s do it anyway, right?
Reg: Okay.
Russ: What IBM did when they turned that corner into Transaction Server, kind of in the Transaction Server 1.1 or 2 timeframe, is they went back and they took a look at that—you mentioned BMS maps, right?—and what they did was that they changed under the covers the way those maps were generated. What they added was this section of metadata that actually described the screen. It described the fact that there is a field and that field is named—let’s make it easy—PART_NUMBER, right?
Reg: Okay.
Russ: And the metadata—
Reg: Good COBOL variable name.
Russ: That’s right, and what the metadata described was really kind of the data structure through which that COBOL application interacted with the screen. Now let me say that a little differently.
Reg: Okay.
Russ: You wrote COBOL programs back in that era—and it still goes on today—with BMS, or Basic Mapping Support. The COBOL program itself does not care and does not know what row or column a particular field occurs on. It just knows it by name. It moves the value into a field called PART_NUMBER. That's all it does. The COBOL program doesn't know. It in and of itself is insulated from that vagary—
Reg: Right.
Russ: From that little detail. Well of course for everyone in the outside world, if you're interacting with that application through kind of a 3270 data stream, then your only hope is to interact with it based upon rows and columns. But what IBM did—and again, back in the Transaction Server Version 1 era—was they added the metadata to the map load module so that we in our code could go look at that and interact with that COBOL application on the basis of its data structure, not the screen format. So when you use our technology running on the mainframe under CICS—and frankly on a specialty engine, the zIIP processor, if you have one, so it's extremely performant and efficient—when you use our technology to interact with that COBOL application that may have been written decades ago, you are interacting with it on the basis of those field names, on the basis of that data structure that was generated from BMS. So the way we completely circumvent the reliance on rows and columns is by leveraging the features that IBM added deep, deep under the covers of CICS in order to let a program interact with a COBOL application using this common data structure as opposed to—
Reg: So you don’t even have to change the code, the COBOL code in order to do this.
Russ: No, absolutely. The only thing we have to do is that the BMS map needs to have been recompiled somewhere since about 2000 or 2001, right? As long as it has been recompiled or still could be recompiled, then by recompiling it, it will go ahead and create that bit of metadata in the BMS load module, and that is our reference point—we use that metadata dynamically in order to be able to chat with the application based upon the data structure it's using to both give and take field contents. And normally this is not a problem, right? Most companies, most organizations have probably recompiled their BMS maps some time in the last two decades, but even those that haven't, it's not hard to reconstitute those BMS maps so that we can in fact compile them or recompile them.
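For contrast, a minimal sketch of what field-name-based interaction might look like from a caller's point of view; the function and field names here are invented for illustration and are not the actual Hostbridge API.

```javascript
// Illustrative only: the caller works with the field names from the
// application's BMS-generated data structure, never with rows and columns.
async function lookUpPart() {
  const response = await runTransaction({    // runTransaction is hypothetical
    transaction: "INQ1",
    fields: { PART_NUMBER: "AB1234567890" }  // set by name, not by position
  });
  console.log(response.fields.DESCRIPTION);  // read by name as well
}
// Because the binding is to field names carried in the map's metadata, the
// screen layout can change without breaking this caller.
```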
Reg: Now I assume that it’s still done—it’s been a few decades since I worked with CICS but I assume it is still essentially done as Assembler macros. Is that a fair observation?
Russ: Yeah. BMS maps have always been described using a set of Assembler macros where you define—you know, here’s the screen and then here’s the fields on the screen and this is where you decorate it according to highlighting and color. The genius of it looking back was what the BMS map did was it really did create a layer between the COBOL program that used the map and the outside world because again, it allowed the COBOL program to kind of be ignorant of where that data landed on a particular screen. That’s the job of BMS and the map, but it was hard to kind of get between them, right? It was hard to be able to kind of interdict that point of control until IBM created this API where we can in fact now interact with the application. So to your point, the application absolutely does not change. It doesn’t have to even be recompiled. It’s just—the application is kind of ignorant of the whole thing and we can still interact with it without any screen scraping.
Reg: Now I'm thinking about this. If IBM is providing you with the data as metadata, is it also providing you the information you would otherwise get from a copybook, or is it still useful to have the copybooks for the data variables as well?
Russ: We really don't need the copybook because we can mine the BMS map in real time—and this all happens automatically. I mean lots of integration technologies out there require you to download your BMS maps and harvest all this data and do all this sort of stuff, and with our technology you don't have to do any of that. It's just totally unnecessary because we're living on the mainframe under the covers of CICS and so we're able to inspect or look at the contents of that load module. So when a COBOL application says I would like to display a screen and the map is named X and the map set is named Y, then what we receive from the application is a statement of its intent. In other words, we're sort of playing the role of BMS at that point.
Reg: Okay.
Russ: We receive the directive. I would like to send this map and here’s a data structure that has all the data that you will need to fill in that map. Well, if you’re actually using a terminal, then BMS goes about its business and says oh great. It’s a 24-row by 80-column terminal and I’ll decorate it like this. I’ll take those variables and I’ll plug them in all the right locations and boom, here it is on the screen. Well that’s what you would do if you were coming in from a screen; however—
Reg: Okay so your application thinks it is just sending it to another terminal, and you’re acting in place of that terminal.
Russ: Absolutely. We’re taking the role of BMS through this API, and so what we get is we get the directive from the application that says hey, I want to display this BMS map and map set and oh by the way, here’s a data structure with my data. What we do is we take that BMS map name and map set, we go out and grab the load module that has that information no differently than BMS would have done. We then go down, review the metadata in the BMS load module. Again we have no intention of interacting or generating a 3270 data stream; we just want to interact with the application. And so the metadata tells us where, how to decode its application data structure that the application sent to us. Well, great. So we know the field names. We have a data structure where all the data resides. We can grab that and now we can externalize it for application integration, and quite literally the 3270 data stream is never generated. It doesn’t exist. It is never generated as output nor do we generate it as input to the application because we are playing the role of BMS and interdicting that whole entire process.
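As a purely illustrative example, the metadata mined from a BMS map load module might conceptually look something like the object below; the names and shape are invented and do not reflect the real load-module format.

```javascript
// Conceptual view of the metadata that describes a map's fields, letting the
// application data structure be decoded by name when the program issues its
// SEND MAP directive. No 3270 data stream is ever generated in this path.
const mapMetadata = {
  mapset: "ORDSET",                        // hypothetical map set name
  map: "ORDMAP",                           // hypothetical map name
  fields: [
    { name: "PART_NUMBER", length: 12 },
    { name: "QUANTITY",    length: 4  },
    { name: "STATUS",      length: 8  }
  ]
};
```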
Reg: Cool. Now when you take that data, are you putting it into XML, or a combination of formats, or some other optional formats?
Russ: Yeah, that's a great question. The first version of the product that we came out with—we now kind of look back and say well, it was lovely, but it was a good first start. What we did, since this all grew up in the era of the early 2000s, was we expressed that information in XML. And so Version 1 of the Hostbridge integration engine allowed someone to send in an HTTP request with a set of parameters about the transaction they would like to invoke and pass in a number of variables to that transaction either as query parameters or in a payload. Then what we did was we invoked the transaction, we interacted with it according to, you know, the directive, and then we returned the output as XML. Now that was kind of Version 1, and what we began seeing—and of course JSON kind of wasn't a thing then, JavaScript wasn't a thing then—and so XML made good sense. But what we saw customers do was, you know, no one executes just one transaction, right? They're trying to accomplish a work process, and historically these sorts of applications were written such that you had to walk your way through a number of screens. And so what we saw our customers do was they would develop these integration scripts, and so it might take the form of a program that they write in Java, or back in those days Visual Basic or something like that. They would automate this series of requests, so they would send in an HTTP request to run Transaction A. They'd get the response back: they'd log it, they'd keep it, they'd mine it for whatever they want. They would then send in another HTTP request and get back data. We were working with our customers and we realized they were doing this dozens and hundreds and sometimes even thousands of times. We looked at that and we thought well, that's just not efficient. That's not anyone's idea of efficient integration, trying to orchestrate all that fine-grained activity from outside the mainframe. So we began looking at that with our customers and said, how do we solve that problem? I mean what would it look like if we could orchestrate all of these fine-grained interactions, these interactions that take microseconds or at most milliseconds? What if we could orchestrate those at machine speed on the platform?
Reg: Oh.
Russ: In other words, is there a way that we could do that? Well we looked at that and we made what, for us, felt like a kind of risky decision at the time. This was like 2005, so let's see, 17 years ago. There was this thing—I'll call it a thing. There was this language just on the horizon. It was called JavaScript, and it was clear that JavaScript was about to kind of take over the world in terms of how we authored and operated web pages within a browser. That handwriting was on the wall. The takeover hadn't quite, you know, been effected yet. A lot of people were still writing HTML pages absent JavaScript in that era, but it was clear that JavaScript was destined to become probably the most widely known programming language or syntax on planet Earth, which it is. So Scott Glenn, my Hostbridge cofounder, and I decided to bet on JavaScript, that it would not only take over the client side but it would take over the server side as well. So we embarked on what has now been a 17-year journey of developing and offering what is essentially the biggest, baddest server-side JavaScript implementation on planet Earth. It's called HB.JS, and that became our orchestration and automation engine that runs on System Z under CICS and on the zIIP specialty engine, assuming you have one. What this engine does is you send a single request to what we call HB.JS—and it's rendered HB.JS to emphasize its JavaScript orientation. So you send a request, a single request, to the engine and then the engine follows the instructions that someone has written in JavaScript, and then we can orchestrate a dozen, a hundred, even a thousand interactions with not only terminal-oriented applications but COMMAREA programs or Db2 calls or VSAM data—
Reg: Oh. Oh wow.
Russ: We perform all of that orchestration at machine speed and then we yield a single composite response back.
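A rough sketch, in ordinary JavaScript, of the orchestration idea Russ describes: one inbound request drives a script that performs many fine-grained interactions at machine speed and returns a single composite response. The helper names (runTransaction, linkProgram, queryDb2) are invented for illustration, not the HB.JS API.

```javascript
// One request in, many interactions orchestrated on-platform, one composite
// response out -- instead of dozens of network round trips from outside.
async function getCustomerSummary(customerId) {
  const profile = await runTransaction("CUST", { CUST_ID: customerId });
  const orders  = await linkProgram("ORDHIST", { CUSTOMER: customerId });
  const balance = await queryDb2(
    "SELECT BALANCE FROM ACCT WHERE CUST_ID = ?", [customerId]
  );
  return { profile, orders, balance };  // single composite reply to the caller
}
```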
Reg: So the network latency is almost entirely eliminated.
Russ: Absolutely. It drops out of the equation, and that whole latency build-up—I'm glad you mentioned it. I know we'll talk about our analytics product in a minute, but now that we have a product with which we can actually show customers the amount of latency build-up, your comment is really spot on in that it is not uncommon for organizations to have built processes that are doing automation outside the mainframe—and by virtue of all the discrete interactions back and forth, they end up with this build-up of latency. I mean we were working with one customer a couple of years ago with our analytics product where we were actually able to detect and calculate the latency. And in this one organization, we were able to do some analysis just for fun and determined that—how did we say it? That they were wasting an entire person-year of staff time every day. Every day they were wasting a person-year waiting for automation scripts to run due to the latency. So what we did is we looked at this worldwide organization, we isolated all of this kind of screen scraping-ish activity, and we added up all the latency, and it was just staggering the amount of time that's being wasted just due to that. So yeah, to your point, once we move the point of orchestration onto the platform and perform it at machine speed, now you have these huge payoffs. So a work process that might have taken 10, 20, 30 minutes—I've seen them take five hours to run—now they take seconds [or] no more than minutes because we're doing all things at machine speed and the whole workflow is not gated by the speed of your network.
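As a back-of-the-envelope illustration of how per-interaction latency can add up to a person-year of waiting per day, here is the arithmetic with assumed, purely hypothetical figures.

```javascript
// All numbers below are invented to show the scale of the effect, not taken
// from the customer study Russ mentions.
const interactionsPerDay = 10_000_000;  // discrete screen interactions, org-wide
const waitPerInteraction = 0.7;         // seconds of round-trip wait, assumed
const secondsWaited = interactionsPerDay * waitPerInteraction; // 7,000,000 s
const hoursWaited = secondsWaited / 3600;                      // ~1,944 hours
// A working year is roughly 2,000 hours, so figures in this range add up to
// about one person-year of waiting every single day.
```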
Reg: That's outstanding. I remember back in the 90s I was working with an ERP implementation that was talking to Db2 on the mainframe from distributed, and it was practically bringing down the mainframe. We did deep diagnostics and discovered what was happening: it was doing a bind-query-unbind [laughs] many, many times a second and it was bringing the mainframe to its knees, and that was, you know, obviously using the standards available then to talk over the net to those things. And that was without screen scraping even being included, so when—
Russ: Oh yeah.
Reg: You throw in the additional overhead of 3270 and all that stuff.
Russ: Yeah. I mean that reminds me of a number of customer scenarios that we ought to get into—but your point is spot on, Reg, and it's that when you orchestrate these fine-grained activities—whether it's running a transaction where, given the speed of the mainframe, a particular leg of the transaction might run in 200 microseconds, right? When you're orchestrating that across even a modern network—let's say a 1-gig backbone between two endpoints—when you're orchestrating that at network speed, you're still way out of whack in terms of time, right? For example, that one customer I mentioned where we did a study of the business impact of all this latency: an exacerbating factor was that the servers running the automation that was interacting with these screen-oriented apps were located—I think it was in Chicago—and the mainframe was located at a data center down in Dallas, and there's a 1-gig pipe between them. Well when these scripts start to run hot and heavy, what do you imagine is the biggest consumer of that 1-gig pipe, right? Well it's all of this screen-scraping activity that's going back and forth, back and forth—
Reg: Right.
Russ: And so that’s where this latency comes up. I mean I’ve seen this—it’s not uncommon. We’re looking at customer data every week and it’s not uncommon to see scenarios where you have distributed components outside the mainframe trying to automate something on the mainframe, and the level of intensive—the activity level, the intensity of interaction—is so severe that if this was a security context, we would call it a denial of service attack, right? And that’s really—
Reg: And this is with automation.
Russ: Right.
Reg: You know you want it as close as possible to what you’re automating. The further away you are, it’s like trying to automate something—you know, around Mars from Earth. You have such an incredible latency.
Russ: That's exactly right, and so that's why we philosophically just decided to go for this solution that said okay, we're going to be the team. We're going to be the company that champions the cause of doing automation on the platform at microsecond speed and running on a zIIP engine if you have it, so it is as efficient as we, or I think anyone, might be able to conceive it as being. It makes all the difference in the world when you do it there. Now I'll also throw in one of the things that our customers are also pleasantly surprised by: not only does latency drop through the floor for these business processes, but their CPU time also goes down. Now how is that possible? Like by what laws of physics can you pull that one off? Well, it's kind of interesting. You know sometimes we forget the overhead it takes just to convey that one interaction from something emulating a 3270 terminal, right? Yes, it went across the network. Now it entered the mainframe. Now it's passing through VTAM and it's probably going to go through a session manager, and the session manager is going to pass it over to CICS. Then CICS is going to drag it through its terminal control layer, and if it's BMS, it is going to drag it through BMS. Then as soon as the application does its thing in, you know, 50 microseconds or whatever, the output is going to transit all the way back through all of those layers on the mainframe. Now that's measurable overhead, right? That's measurable, and with our integration engine or—I'm sorry, our analytics engine—now we can actually measure it and we can size it and we can see the wastage. And so the fact is that yes, doing automation on the mainframe—let's say even if you run the exact same transactions, right? No change. I mean the transactions are still going to incur whatever overhead they have, right?
Reg: Sure.
Russ: But if we can eliminate a million, ten million discrete interactions with the mainframe via some sort of a 3270 interaction layer, if we can eliminate millions of discrete interactions—as we’ve said we not only save on latency, we actually reduce overhead on the platform—everything becomes more efficient.
Reg: Now I feel I'm being dragged by the power of segue into talking about your analytics engine, and I'm not going to let you go there yet because I'm still fascinated by the potential and power of what you have. I have a saying: the better you are, the more room you have for improvement. It's like increasing your surface area. And so it sounds like you've taken this wonderful journey over the past two decades of discovering all the ways you can improve—and I'm going to guess that you probably have, in the back of your mind and probably in the front of your mind, a million ideas for doing things even better—so I want to just plumb a couple of things you may already be doing or probably thinking about. One of them is, when I think about an optimizing compiler—obviously you want to be super careful not to second-guess what the user is up to when you get a whole giant list of things to do—but if you've got binding and unbinding to the same place and stuff like that, what sorts of things are you doing, or do you have in mind to do, to take, you know, that giant piece of "please do this for me" and make it run even more effectively?
Russ: Well that's a great question, and I would say not a week goes by that we're not talking to our customers and taking inbound requests and comments. I love our customers for many reasons, one of them being they are not shy to tell us exactly what they'd like for us to focus on. So right now in the engine, in the integration engine, we're doing a number of things that have to do not only with speed and efficiency but also with being able to tackle broader use cases. One of them—one area that we're kind of intrigued by and adding features around—is hybrid integration. I mean if you own a mainframe today, let's just stipulate you operate in a hybrid environment, period.
Reg: Hmm. Right.
Russ: No one runs everything they have on the mainframe, at least not in our customer set, and so what they're wanting to do is to build these integration scripts that not only communicate with data and applications running under CICS; they want to be able to also integrate with data and APIs off the mainframe. So imagine a request that comes in to fire off an integration script—as we call it—running under CICS, and that integration script, sure, it runs a few transactions, links to a few programs, accesses some data. But let's imagine that some data that's required in this process is sitting over on a Microsoft Azure data lake. What we need to be able to do is fire an HTTP request over to an endpoint to access that data and bring it back. Well, great. We can do that. Or let's imagine that once that business process concludes, before the results flow back to the requester, we want to actually log something out to some sort of an operational data store. So we can make some sort of an outbound HTTP call from the CICS platform to prep and move that activity data down there. Well that's certainly doable in our platform now, and those are the sorts of things where we see our customers guiding us and driving us toward making our integration engine not just something that interacts with all of these mainframe-ish components—you would expect that—but with all of these non-mainframe or hybrid components, because the services of the future are going to require data and apps from all of your hybrid sources, not just the mainframe or not just Azure or AWS or Google.
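A minimal sketch of the hybrid pattern being described, assuming a generic HTTP helper: an integration script that fetches reference data from an off-mainframe endpoint mid-process and logs activity to an operational data store before returning. The URLs and the httpRequest helper are hypothetical.

```javascript
// Hybrid orchestration sketch: a mainframe-side script reaching out to
// off-platform services over HTTP. Names and endpoints are invented.
async function priceAndLogOrder(order) {
  // Fetch reference data from, say, an Azure-hosted service.
  const rates = await httpRequest({
    method: "GET",
    url: "https://datalake.example.com/rates/" + order.region
  });

  const priced = { ...order, total: order.quantity * rates.unitPrice };

  // Log an activity record to an operational data store before returning.
  await httpRequest({
    method: "POST",
    url: "https://ods.example.com/activity",
    body: JSON.stringify({ orderId: order.id, total: priced.total })
  });

  return priced;
}
```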
Reg: Now the one other thing I'm aware of not having touched on here, before I let us go all Babbage on your analytics engine, is the development and testing process of these scripts. What tools or what approach do you recommend in order to have a well-QA'ed chunk of instructions being sent off to the mainframe?
Russ: That’s a good question. You know our customers take different approaches here, and so we are very free and kind of open-minded about kind of the development end of this. Most of our customers still use an Eclipse-based development model and platform—
Reg: Ah. Okay.
Russ: So we provide a plugin that just snaps into Eclipse for them to use. Many of them still use or are using Eclipse because they have their mainframe-based version control components that also nicely snap into an Eclipse framework.
Reg: Right.
Russ: Some of our customers now are beginning to use VS Code as a development platform, so we have an active project where we’re supporting our customers doing that and making sure that we support the VS Code approach. Ultimately where we see our customers going is using pure web-based tools to be able to do this, as opposed to some sort of a you know, on-workstation editor, whether it be VS Code or Eclipse. Many of our organizations are wanting to move to just a pure web-based edit-test-run-deploy sort of a model, and so that’s really strategically where we’re taking our development tooling. But again it all depends upon customers, their input and their requirements.
Reg: Sure. Okay, well Babbage will not wait any longer. Tell us about your—not analytical engine, like Babbage, but—analytics engine [laughs].
Russ: Yeah. Well this is a story that's near and dear to my heart, and again it starts with a customer. I got a call one day from a customer and it kind of went like this. They said hey Russ, we've got a problem—and if you want to make my day, that's how you start the phone call—and he said here's a reality: Our business is growing nicely—you know 8%, 10%, 12% per year—but our mainframe transaction volume is growing asymmetrically higher relative to our business, right? So, business growing nicely, transaction volume growing asymmetrically to the underlying business activity. And it was like, what's up with that? We cannot deduce why this is occurring. We know it's occurring. We have an idea that it has to do with a bunch of screen-scraping activity that's out in the field somewhere, but we can't see it.
Reg: So it sounds like we're talking about an order N log N or order N² issue.
Russ: Oh, it is, and the data is compelling, and so I had to say, you know, that's an astounding observation. I mean I don't have any magic thing on the shelf, but would you work with us? Would you collaborate with us to try to invent this? And they said sure. So we, over a period of weeks and months, conceived of a solution. That solution sent us headlong into a whole new area of the company that we now call Integration Analytics. The reason we call it that is that there's no shortage of tools to help customers zoom in on what's going on in and around their network outside the mainframe, and there's no shortage of tools that let organizations zoom in on the mainframe and figure out oh yes, we ran transaction XYZ and on average it took 350 microseconds and did five I/Os, and we ran it 30 million times today. Great, but there was nothing to let them see the forest for the trees. There was nothing to let them ask the question, now why are we running that transaction 30 million times a day, and what are the business processes in which it's tangled up? Who are the end users who are causing this? Now that is not obvious. There just aren't a lot of good tools to answer that question, and so what we did is we authored some new intellectual property in and around how you could track this under the covers of CICS, and it took the form of a new US patent from Hostbridge. That's neither here nor there, I guess, but we really took a fresh look at this, and what we did was we said wouldn't it be great if we could have just a tiny bit of very lightweight software running on the mainframe under CICS, and what this software would do is it would look over the shoulder of all of these input and output activities and grab a little bit of metadata, just little bits that would let us understand the context in which this is occurring. And then we're going to take that metadata, and by handling it in a very precise way, we're going to allow that metadata to naturally flow into the SMF records that are cut by CICS for every transaction. Now once we have that metadata—we call it the enriched information, right? Once we've enriched that SMF 110 data, now what can we do with it? We thought well, we need to be able to do some pretty sophisticated analysis on these patterns of interaction using a proper analytics platform, and so we decided to use Splunk.
Reg: Ah.
Russ: So what our integration analytics practice and technology revolves around is a very lightweight collection of software components that run on the mainframe under CICS—and they're completely separate and distinct from our integration software components—just very lightweight components. They grab little bits of metadata about each and every targeted category of request, whether it's 3270 interactions, HTTP, socket interactions, MQ, whatever. And then we grab that metadata and we flow it along with the SMF data down to Splunk, and then we can do some pretty sophisticated analytics to be able to show customers their patterns of interaction. So now they can actually see and visualize all of that—typically all that screen scraping activity, all of these sources of automation—and they can actually see where the opportunities for improvement are. What are the business processes that just beg to be optimized, right? And that's really what the integration analytics technology is for. It's to give customers for the first time a microscope so that they can actually see the forest for the trees. They can watch these interactions and they can size and determine the latency, the excess latency, the excess CPU burn, all of the operational implications of some of these legacy integration patterns.
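Purely as an illustration, an enriched event flowing to an analytics platform such as Splunk might conceptually carry fields like the ones below; the names and values are invented, not the actual record layout.

```javascript
// Conceptual shape of one enriched interaction event: lightweight metadata
// captured under CICS and flowed alongside the SMF 110 data.
const enrichedEvent = {
  transaction: "ORD1",
  mapset: "ORDSET",
  map: "ORDMAP",
  aidKey: "ENTER",               // which AID key drove the interaction
  userId: "SALESREP42",
  terminalNetname: "TCP00123",
  clientIp: "10.1.2.3",
  cpuMicroseconds: 350,          // basic metrics still come from SMF itself
  timestamp: "2022-05-01T12:00:00Z"
};
// Aggregating events like this is what reveals which business processes and
// which groups of users generate the high-volume interaction patterns.
```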
Reg: Now I'm going to nerd out on you for just a second, and then I'm going to go back to the use case that drove this. As I think about this, you're basically adding a bit to an existing SMF record that CICS is already cutting, and then you're either paring down that SMF record to just the data you want and sending it to Splunk, or sending the whole record to Splunk using some redirector. How do you do that?
Russ: Yeah, that's a good question. I mean there's a lot of flexibility here, so I'll describe it one way, but really customers can go about it however they want. We grab a number of different elements of metadata depending upon the use case. So for example if it's 3270 applications, we can grab all the way down to the BMS maps that they're working with and what AID key was entered, because what we want to be able to do is actually see the pattern of interaction all the way down to the application level. We want a subject matter expert to look at the metadata and have an aha moment, as we say, where we can say oh, I see what they were trying to accomplish. Well once we—
Reg: Okay so then you throw that onto SMF record that’s already on the way out of CICS?
Russ: That's right. We add it to the SMF record. Now the SMF record contains lots of data, and so customers may or may not want to send all of those fields down, and they have lots of choices. But once that data gets down there, whatever set or subset they believe is important for them—it definitely always includes CPU time and all of the basic metrics—but once it gets down to Splunk, then we can do all sorts of analytic work down there. But again we're not trying to answer questions that they can already answer. No one needs our integration analytics to tell them oh gee whiz, we ran transaction XYZ 20 million times. They already know that. What they need to know is why. What is the business process? Who are the end users—by their IP address, by their terminal net name, by their user ID? Who are the people and what are the groups within these organizations that are running these high-impact business processes where they're doing orchestration outside the mainframe, right? And so the whole idea is to let them see that. And Reg, I'll just jump to the chase here.
Reg: Please.
Russ: It never fails—first of all, we find some of the most embarrassing things that you can imagine—but there are usually no more than eight to ten problematic patterns of interaction, and if you address those eight to ten use cases, you will have addressed 80% of the impact that they're causing.
Reg: The Pareto principle.
Russ: It only takes eight to ten, absolutely, and that’s the beauty, we think. That’s the exciting thing to us is now we have a tool to be able to show customers that they don’t need to boil the ocean. Please don’t boil the ocean or even think you need to.
Reg: Right.
Russ: All you need to do is focus your time and attention on usually no more than eight or ten problematic patterns of interaction, and if you get those right, you will have achieved a level of optimization on both the end user side and under the covers of the mainframe that you’ll be very pleased with, and you will have saved money.
Reg: Now can you give us a little insight into the specific use case in terms of what was causing this geometric increase?
Russ: [Laughs] Yeah. Now I’m not going to name the customer—
Reg: Sure, of course.
Russ: But here's what we found. So, for example, there is a big mainframe customer, and their salesforce all across the world uses a very sophisticated set of Excel spreadsheets to interact with the mainframe applications. So the end user interacts with Excel and the Excel spreadsheets interact with the mainframe via—you got it—terminal emulation and screen scraping. Well it turned out that when a rep pushed in a new order, the next thing they really want to know is when is that order accepted, right? Well okay, now let's imagine: how would a programmer 20 years ago, knowing nothing more than Visual Basic and HLLAPI, solve that problem? How would they detect when the order status changes?
Reg: Enter, enter, enter [laughs].
Russ: Press the Enter key. And so at this organization, when we did the Pareto analysis on them to find the most expensive automated process, both in terms of real time and CPU time on the mainframe, it was a script that did nothing more than press the Enter key every 200 milliseconds.
Reg: [Laughs] I call that winning the “Enter” prize.
Russ: That's right. It would press the Enter key every 200 milliseconds, and we found that in some cases, given processing delays or the complexity of the order, it would take anywhere from 10,000 to 20,000—
Reg: Oh my goodness.
Russ: Discrete interactions every 200 milliseconds for the status indication on a particular field to change from pending to accepted.
Reg: Wow [laughs].
Russ: Now, so what happens—you know the scalability thing—what happens as your organization grows and you add sales reps? It's not rocket science, right? The monster kind of comes out of the woodwork, and now you have all of these macros doing all of this screen scraping and automation across distributed networks in arguably the least efficient way possible. So this customer said okay, we need to stop that, right?
Reg: Yeah.
Russ: We’re going to create an order status API that has a callback function, right?
Reg: Nice.
Russ: So instead of beating on the Enter key every 200 milliseconds, let's just make one short API call—or better yet, why don't we just send an email out of the mainframe to the sales rep when the order status changes so they can go off and do something else? Let's make our business process more event-driven and save ourselves, oh, I don't know, a few million transactions every day. We see a lot of things like that—honestly, to the customers they're quite embarrassing, but we just have to say look, you're not alone. We see this every day and our job is to show it to you and help you conceive of strategies and tactics to fix it.
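To make the contrast concrete, here is a sketch of the polling anti-pattern and the event-driven alternative; every helper name and URL is invented for illustration.

```javascript
// The legacy macro effectively did this: press Enter every 200 ms until the
// status changes. 10,000 to 20,000 round trips at 200 ms apiece is roughly
// 33 to 67 minutes of continuous screen scraping per order.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function pollUntilAccepted(orderId) {
  let status = "PENDING";
  while (status === "PENDING") {
    await sleep(200);                                  // 200 ms between presses
    status = await pressEnterAndReadStatus(orderId);   // hypothetical helper
  }
  return status;
}

// The event-driven alternative: register interest once and let the mainframe
// side notify (callback, message, or email) when the status actually changes.
async function watchOrder(orderId) {
  await registerStatusCallback(orderId, "https://crm.example.com/order-status");
}
```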
Reg: Well this has been outstanding. We need to start winding down now. I don't want to miss anything else you had in mind to share, but let me prompt you. On the one hand, if there is anything else, innovations or experiences you guys have dealt with – for the next five minutes – but also, I want to hear your take on the future. I mean you are in a position to really have a vision for the future—not just of what you hope and expect to see, but what you hope to make happen with your involvement in the ecosystem. So maybe just a few closing thoughts on that?
Russ: Yeah, well you know the future—let's see. How would I say it? On the one hand, it's really exciting because we have all of these developments, at least in the world of the mainframe, like IBM coming out with the z16. I mean just an incredible technology platform—
Reg: The greatest computer ever created.
Russ: Unbelievable stuff, and I mean I just can't wait to get my hands on it, you know. When you're a software vendor you have to kind of moderate because you really can't get ahead of your customers. So we don't have any customers with z16s yet, but as soon as we do, there are all sorts of tools and technologies and goodies under the covers that we will exploit. But on the other hand, honestly, as exciting as that is, sometimes I get a little depressed because I've spent the better part of the last 20 years trying to put a stake in the heart of screen scraping and emulation as an integration technique or technology. Where are we? Well, we live in a world right now where one of the most popular and "modern" (in quotes) technology platforms is the use and deployment of robotic process automation platforms, RPA. Now if you open up your shiny new RPA platform box and you want to build an RPA that goes against the mainframe, what's the easiest way to do that? What are you tempted to do, Reg? What would you imagine?
Reg: Ah. You remind me of Kurt Vonnegut’s “Player Piano.” You know where he talks about these guys—basically all their movements are recorded but there’s no optimization. The movement is based on a legacy of humans doing it.
Russ: Exactly. So what do we see? We see as exciting as things are and as amazing as [recording cut out] mainframe that are continuing to perpetuate brittle, costly, inefficient ways of interacting with the mainframe. So our message is you know, there’s nothing wrong. We think RPA platforms are great, but please, please, please do not perpetuate the use of emulation and screen scraping as a methodology for achieving automation to the mainframe. Whether it’s our technology or whatever, use something other than that and you will be setting yourself up for success, not back-end mainframe scalability problems and ultimately, costly failures.
Reg: So, optimize forward. Just because something already works doesn’t mean that it’s the best way to do it going forward.
Russ: That's right. Stop, stop, stop—please stop—using high-volume terminal emulation and screen scraping as an integration technique. It is only going to make it more difficult for you to find your future on System Z.
Reg: Any other thoughts just before we finish up?
Russ: No, other than I always like to in these contexts just shout out not only to the folks at Hostbridge—we have just an amazing team of people—but also to our customers. I mean frankly [recording cut out] customers, so I would just say to your listeners, if any of this sounds intriguing, just send us an email. Pick up a phone, call us. We are available and we love to visit, whether you're our customer now or you just have a use case you'd like to visit about. We'd love to visit and learn more about your requirements.
Reg: Outstanding. Well thank you so much, Russ. This has been fascinating and informative. I’ve really enjoyed it. So I’ll be back with another podcast next month, but in the meantime check out the other content on TechChannel. You can also subscribe to their weekly newsletters, webinars, ebooks, solutions directory, and more on the subscription page. I’m Reg Harbeck.