
The Latest in RPG Built-in Functions

In the same announcement as IBM i 7.4 TR4, three new RPG Built-In Functions (BIFs) were introduced: %Upper, %Lower and %Split. Let's take a look at what they offer.

%Upper and %Lower

The purpose of these BIFs is pretty obvious from their names, and they do exactly what you would expect. Namely, they take an input string and convert it to all upper (or lower) case.

At first glance these new arrivals might seem somewhat underwhelming. After all, we've all been coding our own versions of these functions for years—normally using %XLATE or SQL's upper/lower case capabilities. So why am I so happy to see them added to the language? 

Two reasons: First, in recent years I have spent a lot of time training new RPG programmers. These trainees are usually experienced in other languages and expect such functionality to be available. The lack of this capability in RPG was a "black mark." Second, there is a growing awareness among IBM i users that their applications need to be able to handle text, particularly things like names and addresses, in languages other than English. In many cases, those languages include accented characters. I have seen so many cases where programmers failed to appreciate this and did not accommodate them with their home-grown "solutions." Now they have a simple, consistent approach for handling such data without having to manually account for every single character that could be required.

The table shown below demonstrates the results of running English text and its French equivalent through these BIFs. 
 

Original                         %Upper                           %Lower
This is a mixed case string.     THIS IS A MIXED CASE STRING.     this is a mixed case string.
C'est une chaîne de cas mixte.   C'EST UNE CHAÎNE DE CAS MIXTE.   c'est une chaîne de cas mixte.

 
Both of these BIFs accept optional second and third parameters. The second indicates the position in the text at which to begin the conversion and defaults to the first position. The third indicates the number of characters to be converted and defaults to the remainder of the string. So, for example, to convert the entire string except for the first character to lower case I might code %Lower( inputString : 2 ). Similarly, to convert only the first character to upper case I could code %Upper( inputString : 1 : 1 ). We'll see an example of this later.
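
As a quick illustration of the start and length parameters, here is a minimal sketch. The field names and the sample text are mine, chosen only to make the effect visible.

```rpgle
**free
// Illustrative names and sample data only
dcl-s inputString varchar(30) inz('mIXED cASE');
dcl-s result      varchar(30);

// Start at position 2 and run to the end of the string:
// the first character is left untouched
result = %lower(inputString : 2);
dsply result;    // mixed case

// Start at position 1 for a length of 1:
// only the first character is converted
result = %upper(inputString : 1 : 1);
dsply result;    // MIXED cASE

*inlr = *on;
return;
```

Note that neither call changes inputString itself. As with other string BIFs, the converted value is returned as the result of the expression.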

%Split

The attraction of this BIF is far more obvious. It takes a character string and breaks it up into individual elements. By default the separator is a space, but it can be any one of a number of characters specified by the programmer. The result is a temporary array which can be directly processed via a For-each loop or assigned to a regular array.

Let's start with an example that breaks a sentence into its component words. Since %Split returns a variable-length array, I can use a For-Each loop to process the results. If you are not familiar with For-Each you can read more about it on our Authory page.

Dcl-S input1   VarChar(50) Inz( 'Words and still more words');
Dcl-S field    VarChar(15);
 
dsply input1;
For-each field in %Split(input1);  // By default splits on spaces
   dsply field;
Endfor; 
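
The second parameter can also list more than one separator character, in which case any of them ends an element. The sketch below assumes a made-up input string that mixes commas, semicolons and spaces; the field names are illustrative.

```rpgle
**free
// Illustrative names and sample data only
dcl-s input3 varchar(50) inz('red,green;blue yellow');
dcl-s color  varchar(15);

// ',; ' names three separator characters: comma, semicolon and space
for-each color in %split(input3 : ',; ');
   dsply color;   // red, green, blue and yellow in turn
endfor;

*inlr = *on;
return;
```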
 
If you are like me, your next thought might be to use this to split up a CSV string. Like this:

Dcl-S input2   VarChar(50) Inz( 'Field1,Field2,"Third field",Last'); 
 
dsply input2;
 
For-each field in %Split(input2: ','); // Comma separator
   dsply field;
Endfor; 

There is a problem with this, however: it may not operate in quite the way you expect. This is because %Split ignores consecutive separator characters. When the separator is a space, that is exactly what we want. But when dealing with CSVs it is not unusual to have consecutive separators because of omitted or null values. For instance, if there were no value for Field2 in the above example we would expect input2 to look like this: 'Field1,,"Third field",Last'. Processing that input with %Split would return only three elements (Field1, "Third field" and Last), and we would be in danger of processing the value "Third field" as if it were the second field.
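
To see the effect for yourself, a small test along these lines should do. The names csvLine, parts and msg are mine; the sketch simply counts the elements that %Split returns for the input with the missing value.

```rpgle
**free
// Illustrative names only
dcl-s csvLine varchar(50) inz('Field1,,"Third field",Last');
dcl-s parts   varchar(20) dim(*auto : 10);
dcl-s msg     varchar(30);

// The two adjacent commas are treated as a single separator,
// so no empty element is produced for the missing Field2
parts = %split(csvLine : ',');

msg = 'Elements: ' + %char(%elem(parts));
dsply msg;        // Elements: 3
dsply parts(2);   // "Third field" is now in the second slot

*inlr = *on;
return;
```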

One way to deal with this is to modify the input prior to processing with %Split to ensure that there are never two consecutive delimiters. This code demonstrates one way to do this (an explanation follows the code):

Dcl-S string1  VarChar(50) inz('Field1,,"Third field",Last');
Dcl-S string2  VarChar(50);
 
Dcl-S names   VarChar(20)  Dim(*auto: 20);
Dcl-S name    VarChar(20);
 
string2 = %SCANRPL (',' : '@,@' : string1);  // Note # 1
 
dsply string1;
dsply string2;  // Displays: Field1@,@@,@"Third field"@,@Last
 
 
names = %Split (string2 : ',');  // Note # 2
For-each name in names;
   dsply name;  // Shows Field1@ - @@ - @"Third field"@ - @Last
Endfor;
 
 
names = %Trim(names : '@');  // Drop leading & trailing @ from all
For-each name in names;
   dsply name;  // Field1 -  - "Third field" - Last
Endfor; 

Note 1: We use %ScanRpl to replace all instances of a comma with the string '@,@'. The @ symbol is just a suggestion—any character that you know cannot appear in the original text will work.

Note 2: We then use %Split to break out the fields. Notice that in my example I have assigned the result of the %Split to a dynamic array for subsequent processing. I could have used a conventional array (and indeed on releases prior to 7.4 you will need to) but since the result of %Split is a temporary variable dimension array it seems like a good fit. It also simplifies the logic in that I can simply apply for-each processing to the array.

If you compile and run this code, the display makes it obvious that there are now spurious @ signs in the resulting data. Luckily %Trim lets us strip these off easily, and it can be applied to the whole array in a single statement.

This will not provide a complete solution for every single CSV situation that you encounter. There are still occasions when you will have to resort to CPYFMIMPF (shudder), or use an Open Access handler, or Scott Klement's CSV utilities (www.scottklement.com/csv/) for example. But in many cases this will do the job just fine.

Combining the BIFs 

 
Just for fun I wrote a short program that combines the three BIFs I have described here, to upper-case the first letter of each part of a name. Basically it splits the full name into individual components, then converts each one into lower case before upper-casing the first letter of the result. It then strings the results together into a revised name. Here's the code: 

dcl-s  inputName    char(50)  inz('WILLIAM james williams');
dcl-s  outputName   varchar(50)  inz;
dcl-s  names        varchar(20)  dim(*auto: 100);
dcl-s  name         varchar(20);
dcl-c  blank        ' ';
 
names = %split( inputName );
for-each name in names;
   outputName += %Upper( %Lower( name ) : 1 : 1 ) + blank;
endfor;
// outputName now contains 'William James Williams ' (with a trailing blank)
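
If you find yourself needing this in more than one place, the same logic can be wrapped in a subprocedure. What follows is only a sketch: the procedure name toProperCase and the field sizes are my own choices, and I have added a %TrimR to drop the blank that the loop leaves after the last name.

```rpgle
**free
ctl-opt dftactgrp(*no);   // needed to use subprocedures with CRTBNDRPG

dcl-s fixedName varchar(50);

fixedName = toProperCase('WILLIAM james williams');
dsply fixedName;          // William James Williams

*inlr = *on;
return;

// Hypothetical helper: upper-cases the first letter of each
// blank-separated part of a name and lower-cases the rest
dcl-proc toProperCase;
   dcl-pi *n varchar(50);
      fullName varchar(50) const;
   end-pi;

   dcl-s result varchar(51) inz('');   // one extra byte for the final blank
   dcl-s part   varchar(50);

   for-each part in %split(fullName);
      result += %upper(%lower(part) : 1 : 1) + ' ';
   endfor;

   return %trimr(result);              // remove the trailing blank
end-proc;
```

Bear in mind that this simple approach will not cope with names such as McDonald or O'Brien, where a letter other than the first also needs to be in upper case.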

As you can imagine there is a lot more you can do with these new BIFs. I'd be interested to hear what uses you have been able to find for them.

Final Details and RDi Notes

These BIFs are available on both the 7.3 and 7.4 releases and require PTFs for both the compiler and the runtime. You can find full details of the current PTF numbers required in the RPG Cafe.

One last word for our fellow RDi devotees. In recent times we have been spoiled and RDi support for new functionality has often preceded its availability. That is sadly not the case this time around and RDi will not recognize the new BIFs until its next release.