Bound to Compile: Skipping the Source Code
Reg Harbeck on the advantages of bypassing source when optimizing binaries
By Reg Harbeck
05/31/2021
Compiled Languages, or Scripts?
If you’re doing the same thing day after day that involves processing massive amounts of data, you probably want your programs written in an industrial-strength compiled language that is optimized for the platform. If you’re just trying out a kludge to tie a pair of utility programs together for a one-time or occasional need, a script may be more suitable.
Given the vast amount of compiled COBOL in the world, it seems likely that the trade-offs often favor the up-front planning and compiling that such heavy-duty work demands. When the risks are great and the consequences of insufficient planning and validation are significant, it makes sense to look down the road.
The quality, reliability, and speed that result from such planning are definitive for those world-class applications that process vast amounts of critical data, but these essential benefits do come with costs. Everything has to be considered and planned before the first program is sketched out in some kind of meta-code and then written in the language of choice.
Mutual Optimization
Once you’ve written it in that language, it’s all just straight translation to machine code, right? Actually, while COBOL code sometimes looks like it could map one-to-one with IBM Z Assembler, advances in compilers and architecture and mutual optimization have left that simplistic perspective far behind.
Fifty-seven years of mutual optimization can result in some amazing innovations, many of which require advanced education, training and experience to even begin to grasp at more than a marketecture level. So, if the System/360 mainframe was optimized to the business needs that were also implicit in the design of COBOL, the journey since then has been a profound interaction between these factors as the state of the art in science, technology and business has also hurtled forth.
Innovations in Compiling
Today, we read about processor innovations like out-of-order instruction execution, and most of us realize that we’d need to let a computer program figure out how to rearrange our code to get the most out of such innovations. And no, my genius colleagues who feel confident you could do such a thing by hand, you shouldn’t: the people who inherit your code are unlikely to have the same depth of understanding of the technology, and they will need to keep it properly maintained while handling their other responsibilities as well.
Then there are other innovations that are even more amazing. What compiling does to speed up an interpreted program, moving functionality out of software and building it into the hardware does even more so.
Consequently, when IBM takes the massive decimal array math implicit in vast amounts of business financial processing and responds to that need with vector decimal instructions in the architecture, the resulting increase in speed from recompiling code to use the new instructions, without needing to modify the source, can be better than an order of magnitude improvement.
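To make "decimal array math" a little more concrete, here is a minimal sketch of the shape of that workload: the same exact-decimal operation applied to every element of an array of amounts. It uses Python's standard `decimal` module purely to model the arithmetic; it says nothing about how any hardware executes it, and the names `apply_rate`, `balances` and `rate` are invented for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")  # round results to two decimal places


def apply_rate(balances, rate):
    """Apply the same decimal rate to every balance, rounding to cents.

    This is the element-by-element pattern that array-oriented decimal
    hardware instructions are meant to speed up. (Illustrative only.)
    """
    factor = 1 + rate
    return [
        (balance * factor).quantize(CENT, rounding=ROUND_HALF_UP)
        for balance in balances
    ]


# Hypothetical sample data
balances = [Decimal("100.00"), Decimal("2500.50"), Decimal("0.10")]
rate = Decimal("0.035")

updated = apply_rate(balances, rate)
print(updated)
```

The point of the sketch is only that the operation is identical for every element and must stay exact to the cent, which is what makes it a natural candidate for moving into the hardware.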
The Compiling Process
But the process of compiling is more than just planning and vetting and execution. There are several steps between writing up the program specs and having a running application, key among them being:
- Having an official repository for the source where you can track the current and previous versions of a program
- Having official sources for the related objects such as the COPYBOOKs for variables and other shared definitions and even code, not to mention JCL for compiling and linking
- Having official libraries to compile into for testing and eventual promotion to production of compiled and linked programs, aka load libraries
- Linking the compiled program into place so it has all the connections needed to production functionality. These days, linking is often referred to as “binding” in what I think of as a tip of the hat to binding database plans, both of which save time when the program is running by predefining connections rather than having them dynamically resolved at runtime.
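The idea behind that last step, predefining connections instead of resolving them at runtime, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how a mainframe binder works: `LIBRARY`, `ADDTAX`, `ROUNDUP`, `call_dynamic`, `bind` and `run_bound` are all hypothetical names made up for the example.

```python
from typing import Callable, Dict, List

# Stand-in for routines that live in other modules (hypothetical).
LIBRARY: Dict[str, Callable[[int], int]] = {
    "ADDTAX": lambda cents: cents + cents * 5 // 100,    # add 5%, in cents
    "ROUNDUP": lambda cents: -(-cents // 100) * 100,     # round up to a whole unit
}


def call_dynamic(name: str, arg: int) -> int:
    """Look the routine up by name on every call (resolved at runtime)."""
    return LIBRARY[name](arg)


def bind(names: List[str]) -> List[Callable[[int], int]]:
    """Resolve every name once, up front, and fail early if any is missing."""
    missing = [n for n in names if n not in LIBRARY]
    if missing:
        raise LookupError(f"unresolved references: {missing}")
    return [LIBRARY[n] for n in names]


def run_bound(routines: List[Callable[[int], int]], arg: int) -> int:
    """Call the pre-resolved routines in order; no lookups happen here."""
    for routine in routines:
        arg = routine(arg)
    return arg


bound = bind(["ADDTAX", "ROUNDUP"])
print(run_bound(bound, 1000))
```

Both paths compute the same result; the difference is when the lookups happen and when a missing routine is discovered. With `bind`, an unresolved name fails before anything runs rather than partway through processing.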
Dusting Off Your Source Code
But a funny thing happens when something works, unmodified, for a very long time. The source gets dusty. That is, people stop paying attention to it, because no one has touched it for so long. And maybe your hierarchical storage system moved it to tape a long time ago, and maybe that tape was moved offsite a long time ago. Maybe you don’t even have active drives that can read that format of tape, or, at the very least, you’ve lost track of which archived file it was that had the original source.
Now, to be clear, any active COBOL application should have gone through multiple iterations of recompiling as new versions of the COBOL language (and mandates like Y2K compliance) came along. Each time should have allowed validation that the source matched the compiled version. And that should have allowed you to blow the dust off of the library and reactivate it so you didn’t have to worry about a mismatch. Should, should, should … should happens.
But, especially with the significant turnover implicit in the arrival of a new generation of mainframers, sometimes there is a loss of connection between the official source and the running programs. The advantages from recompiling for current architectural advances are of such value that it’s worth postponing the archeological search for the source and just cutting to the chase.
Bound to Compile
If interpreted programs can’t misplace their source, compiled programs can’t misplace their linked or bound load modules. If you are running a program, you know it’s the actual version that’s in use.
That’s where IBM’s Automatic Binary Optimizer (ABO) makes the connection. ABO literally recompiles load, or object, modules to take advantage of current architectural features and compiler advances.
With all of that freed-up CPU and consequent budgetary elbow room, you can now take your new applications people and help them learn your context by doing some mainframe archeology and digging up the dino-source programs that are the original bones of your key production applications.
And that’s bound to bring advantages all around.
Reg Harbeck is a mainframe enthusiast who has worked in IT and mainframes for over three decades. He's the chief strategist at Mainframe Analytics ltd.