Of Wildcards, Regular Expressions and Mainframe Security

As Sigmund Freud might say if he were alive today, “Sometimes, a dash is just a dash.” Of course, if he did, he’d likely be talking about one of the two wildcard characters that do all the heavy lifting in CA ACF2.

That’s right: ACF2 only has two wildcard characters—or actually, fewer than two: the asterisk and sometimes the dash. And they each wear several hats. RACF has two as well. Not to be outdone, CA Top Secret (TSS) has four!

But there’s only one that all three External Security Managers (ESMs) that run on z/OS (and in some cases z/VM and z/VSE as well) have in common: the asterisk.

Of course, some folks call it “star” or some other word beginning with “aster” (which means “star”). And what would computing be without bugs? And when you encounter a real live bug and memorialize it onto your computer screen with your hand, the result is sufficiently asterisk-like that you might, as some folks do, call it, “splat!”

But that splat means something different in each ESM. In RACF it means between one and eight characters of any kind, excluding dots (periods). In ACF2 it means exactly one of any character, or none if it’s immediately before a dot or at the end of a dataset or other resource name. And in TSS it means zero to eight of any character, including dots!
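To make the three readings of the splat concrete, here is a minimal Python sketch that translates a lone asterisk into a standard regular expression for each ESM, following only the description above. The function names to_regex and matches are made up for illustration, no other wildcard is handled, and TSS’s implicit prefix matching is deliberately left out, so treat it as a thinking aid rather than a model of how any ESM really evaluates a rule.

```python
import re

# A lone asterisk, as described in this article (not taken from vendor manuals).
ASTERISK = {
    "RACF": r"[^.]{1,8}",  # one to eight characters, never a dot
    "TSS": r".{0,8}",      # zero to eight characters, dots included
}


def to_regex(esm: str, pattern: str) -> re.Pattern:
    """Translate an ESM-style pattern containing '*' into a Python regex."""
    parts = []
    for i, ch in enumerate(pattern):
        if ch != "*":
            parts.append(re.escape(ch))  # everything else is literal here
        elif esm == "ACF2":
            # Exactly one character, or possibly none when the asterisk
            # sits right before a dot or at the end of the name.
            at_end = i == len(pattern) - 1 or pattern[i + 1] == "."
            parts.append(r"[^.]?" if at_end else r"[^.]")
        else:
            parts.append(ASTERISK[esm])
    return re.compile("".join(parts))


def matches(esm: str, pattern: str, name: str) -> bool:
    """True if the whole name fits the pattern under the given ESM's rules."""
    return to_regex(esm, pattern).fullmatch(name) is not None
```

Under this sketch, the pattern SYS1.* fits SYS1.PARMLIB under the RACF reading but not SYS1.PARMLIB.OLD, while PAY*.DATA under the ACF2 reading fits both PAY1.DATA and PAY.DATA but not PAY12.DATA.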

In case your head is beginning to spin, I’ve put together the following table to allow you to cross-reference the ESMs and their wildcards. So if you want to take a break, glance it over for a second, and then we’ll take a detour into the world of regular expressions.

Regular Expressions

A regular expression is a sequence of characters that specifies a search pattern. The concept was originally developed in 1951 and was then elaborated in compilers, editors, command lines, and other contexts. Standards for how to construct regular expressions emerged in the 1980s and have been added to since.

At the heart of regular expressions is the concept of the wildcard character, which Wikipedia tells us, “is a kind of placeholder represented by a single character, such as an asterisk (*), which can be interpreted as a number of literal characters or an empty string. It is often used in file searches so the full name need not be typed.”

Here’s the thing: RACF, ACF2 and TSS were developed in the late 1970s and early 1980s, before regular expressions had settled into accepted standards. Wildcards, however, were already an idea that people used. And just as in card games, where no one card is universally “the” wildcard, so in mainframe security there was no settled wildcard standard in the first two decades of System/360 and its various OSes and succeeding hardware platforms.

Thus, even the ubiquitous splat, while it showed up as the ace of wildcards, doesn’t suit every purpose. But it does suit purposes beyond substituting characters in text strings.

For example, sometimes it is used as a standalone parameter that can’t be combined with other characters and just means, variously, “whatever fits the bill,” or “everything of this kind,” or even “the last thing I was just working with,” depending on the context and the ESM.

Dot’s Dot

One of the funny things about the unique personalities of the three ESMs is how they treat periods, or dots, in dataset and similar resource names.

RACF could have “Respect the Dot!” tattooed on it, probably right under an accompanying tattoo of the IBM logo. But then, that’s kind of how IBM mainframe software that was written by IBM often thinks: everybody do your part to enforce the standards! If dataset names must follow a specific format and dots are part of that, then security permissions must do their share, and recognize that the dot is more than just another character.

That’s funny from a non-mainframe perspective, because the dot has a different meaning in standard regular expressions—but it’s already reserved on our platform for separating “qualifiers” or “nodes” in dataset names, in that familiar one to eight characters optionally followed by a dot and one to eight more characters, repeatable up to a maximum of 44 characters.
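For readers who think better in code, the naming format just described can be sketched as a small Python check. The length limits (one to eight characters per qualifier, 44 characters overall) come from the paragraph above; the permitted character set is an assumption based on the usual z/OS naming rules (first character alphabetic or one of the national characters # @ $, the rest also allowing digits and the hyphen), and the name is_valid_dsname is purely illustrative.

```python
import re

# Assumption: first character is A-Z or national (# @ $); the remaining
# up-to-seven characters may also be digits or a hyphen.
QUALIFIER = re.compile(r"[A-Z#@$][A-Z0-9#@$-]{0,7}")


def is_valid_dsname(name: str) -> bool:
    """Check the qualifier.qualifier... shape with a 44-character ceiling."""
    if not 1 <= len(name) <= 44:
        return False
    # Every dot-separated qualifier must be one to eight valid characters;
    # an empty qualifier (two dots in a row) fails the match.
    return all(QUALIFIER.fullmatch(q) for q in name.split("."))
```

So SYS1.PARMLIB passes, while SYS1..PARMLIB (an empty qualifier) and TOOLONGQUAL.DATA (an eleven-character qualifier) do not. Note that the dot is doing structural work here, which is exactly why it can’t double as the “any character” symbol it is in standard regular expressions.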

That said, the star of RACF submits to the dot, at least when it’s solo. One star can represent any one to eight legitimate characters in a dataset name, except for the dot. But it does gang up when beside itself, as a double-star system implies anything at the end of a dataset name (or other resource name).

ACF2’s star not only doesn’t represent a dot: it can shrink from an encounter with one! Anywhere other than at the end of a string or immediately before a dot, an ACF2 asterisk is one and only one character. But it can become zero characters at the end of a qualifier.

TSS has no truck with such nonsense. That’s partly because TSS considers it plain that dataset names (and most other resource names) are just prefixes, so you don’t need a train of trailing stars or anything else. And inside the string that represents a dataset (or other resource) name, the star and dash are like the cartoon characters Asterix and Obelix: capable of anything, up to eight or 24 characters, of course.

There Can Only Be One!

We may have burnt out the ACF2 star’s roles by now, but there’s a % of the wildcard work under RACF that is also a plus under TSS. Namely, of course, that “%” under RACF is exactly one of any character, while “+” does that job under TSS.

But there’s also the weird case of one-to-one mapping of a wildcard or variable to the user ID (or LID in ACF2 or ACID in TSS). And ACF2 does that with &LID. RACF steps up with two variables—one for the user ID (&RACUID) and one for the current group profile name (&RACGPID). Good ol’ TSS is just like, “I see the percent in that.” So, in a sense, TSS also uses % for one character—but that character is the user, not just some random byte.

Mad Dashes

And that brings us back to the beginning of the article and the inscrutable dash. Of course, its origins were with the enigmatic if not inscrutable Barry Schrager, whose personality pervades ACF2. When I asked him why ACF2 does security evaluations using UID strings rather than the resource profiles in RACF or subsequent owned and permitted resource names in TSS, Schrager told me it was because of the mainframe’s architecture: it does string processing lightning fast, so minimal work is needed to evaluate security using that approach.

Whereas RACF uses two stars at the end of a dataset or other resource name to make it into a prefix, and TSS just says everything’s implicitly a prefix, Schrager gave us the dash to imply “and anything else from here on out.” But only when the dash is at the end. If it’s in the middle of a string, then it has to precede a period to mean “and any additional characters up to the next dot.” But if it’s prior to anything other than a dot or space, it’s just a dash, which is, after all, a valid character in a dataset name.
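The three moods of the ACF2 dash described above can be sketched in Python as follows. This is only an illustration of the paragraph, not ACF2’s actual matching logic: the function name acf2_dash_to_regex is invented, the asterisk is ignored, and it is an assumption that a wildcard dash may also stand for zero characters.

```python
import re


def acf2_dash_to_regex(pattern: str) -> re.Pattern:
    """Translate the dash in an ACF2-style pattern into a Python regex."""
    parts = []
    last = len(pattern) - 1
    for i, ch in enumerate(pattern):
        if ch == "-" and i == last:
            # Trailing dash: "and anything else from here on out".
            # Assumption: zero further characters also counts.
            parts.append(r".*")
        elif ch == "-" and pattern[i + 1] == ".":
            # Dash before a dot: any additional characters up to the next dot.
            parts.append(r"[^.]*")
        else:
            # Everything else, including a dash in the middle of a
            # qualifier, is taken literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts))
```

With this sketch, PAYROLL.- fits PAYROLL.MASTER.FILE, PAY-.DATA fits PAYROLL.DATA but not PAYROLL.OLD.DATA, and A-B.DATA fits only the literal name A-B.DATA.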

Sometimes, however, ya gotta dash to the next line when 80 characters aren’t enough to secure the desired result. So then our dashing hero stands alone at the end of the line, and tells ACF2, “I’m not done yet!” And ACF2 dutifully continues to evaluate contents of the following line as part of the security rule in question.
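As a rough illustration of that continuation behavior, the Python sketch below glues lines together whenever one ends in a dash standing alone. It assumes that “standing alone” means the dash is preceded by a space (or is the only thing on the line), which is how it tells a continuation dash from a wildcard dash such as the one ending PAYROLL.-; the function name join_continued and the sample lines are hypothetical, not real ACF2 rule syntax.

```python
def join_continued(lines):
    """Merge lines ending in a standalone dash with the line that follows."""
    joined, buffer = [], ""
    for line in lines:
        text = line.rstrip()
        if text == "-" or text.endswith(" -"):
            # Standalone trailing dash: drop it and keep collecting.
            buffer += text[:-1].rstrip() + " "
        else:
            joined.append((buffer + text.strip()).strip())
            buffer = ""
    if buffer.strip():
        # A dangling continuation at the end of the input is kept as-is.
        joined.append(buffer.strip())
    return joined


# Hypothetical input: the first line continues, the last one ends in a
# wildcard dash and is therefore left alone.
sample = ["FIRST HALF -", "  SECOND HALF", "PAYROLL.-"]
print(join_continued(sample))
```

The point of the sketch is simply that the same character plays two different roles at the end of a line, and only its surroundings tell you which one it is.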

Hystorix

Looking back to the ’70s and ’80s, it’s easy to miss how much was still being decided and figured out when the three ESM-igos were showing up. Today, when we use stars and dots and other such characters on the Internet or in other common contexts, we may have a somewhat fixed idea of what they mean, but that doesn’t mean that the early contributors to this approach are broken.

However, it’s a reminder that we sometimes need to take a break before coding up security rules, profiles and permissions, especially if we learned our stuff on one ESM and are now working with one or both of the other two. The differences—beginning with asters—must be dealt with. The cards may be wild, but there’s no need for our security to be.