The Latest in RPG Built-in Functions
Jon Paris on the three new RPG Built-in Functions from the IBM i 7.4 TR4 announcement—%Upper, %Lower and %Split
By Jon Paris
06/03/2021
%Upper and %Lower
The purpose of these BIFs is pretty obvious from their names, and they do exactly what you would expect: they take an input string and convert it to all upper (or lower) case.
At first glance these new arrivals might seem somewhat underwhelming. After all, we've all been coding our own versions of these functions for years—normally using %XLATE or SQL's upper/lower case capabilities. So why am I so happy to see them added to the language?
Two reasons: First, in recent years I have spent a lot of time training new RPG programmers. These trainees are usually experienced in other languages and expect such functionality to be available. The lack of this capability in RPG was a "black mark." Second, there is a growing awareness among IBM i users that their applications need to be able to handle text, particularly things like names and addresses, in languages other than English. In many cases, those languages include accented characters. I have seen so many cases where programmers failed to appreciate this and did not accommodate them with their home-grown "solutions." Now they have a simple, consistent approach for handling such data without having to manually account for every single character that could be required.
The table shown below demonstrates the results of running English text and its French equivalent through these BIFs.
| Input | %Upper result | %Lower result |
| This is a mixed case string. | THIS IS A MIXED CASE STRING. | this is a mixed case string. |
| C'est une chaîne de cas mixte. | C'EST UNE CHAÎNE DE CAS MIXTE. | c'est une chaîne de cas mixte. |
Both of these BIFs accept optional second and third parameters. The second is the position in the text at which to begin the conversion; it defaults to the first position. The third is the number of characters to convert; by default, conversion continues to the end of the string. So, for example, if I wanted to convert the entire string except for the first character to lower case, I might code %Lower( inputString : 2 ). Similarly, to convert only the first character to upper case, I could code %Upper( inputString : 1 : 1 ). We'll see an example of this later.
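To make the start and length parameters concrete, here is a minimal sketch. The variable names sample and result are mine, purely for illustration, and the results noted in the comments follow from the parameter rules described above.

```rpgle
**free
// Sketch only: sample and result are illustrative names.
dcl-s sample varchar(30) inz('mIXED cASE sTRING');
dcl-s result varchar(30);

// Start at position 2 and run to the end of the string.
result = %lower(sample : 2);       // 'mixed case string'
dsply result;

// Start at position 1 and convert a single character.
result = %upper(sample : 1 : 1);   // 'MIXED cASE sTRING'
dsply result;

// Start at position 7 and convert four characters.
result = %upper(sample : 7 : 4);   // 'mIXED CASE sTRING'
dsply result;

*inlr = *on;
```

Characters outside the requested range are returned unchanged, which is why the first letter survives in the %lower call.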
%Split
The attraction of this BIF is far more obvious. It takes a character string and breaks it up into individual elements. By default the separator is a space, but it can be any one of a number of characters specified by the programmer. The result is a temporary array which can be directly processed via a For-each loop or assigned to a regular array.
Let's start with a simple example of using it to break up a simple sentence into its component words. Since %Split returns a variable length array, I can use a simple For-Each loop to process the results. If you are not familiar with For-Each you can read more about it on our Authory page.
    Dcl-S input1 VarChar(50) Inz( 'Words and still more words');
    Dcl-S field VarChar(15);

    dsply input1;
    For-each field in %Split(input1);  // By default splits on spaces
       dsply field;
    Endfor;

If you are like me, your next thought might be to use this to split up a CSV string. Like this:

    Dcl-S input2 VarChar(50) Inz( 'Field1,Field2,"Third field",Last');

    dsply input2;
    For-each field in %Split(input2: ',');  // Comma separator
       dsply field;
    Endfor;

There's a problem with this, however: it may not operate in quite the way you expect. This is because %Split ignores consecutive separator characters. Now when the separator is a space, that is exactly what we want. But when dealing with CSVs it is not unusual to have consecutive separators because of omitted or null values. For instance, if there were no value for Field2 in the above example we would expect input2 to look like this: 'Field1,,"Third field",Last'. When we processed that input with %Split we would get only three elements returned and as a result would be in danger of processing the value "Third field" as if it were the second field.
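A quick way to see this for yourself is to count the elements that %Split returns. The sketch below is an illustration only: input3, fields and msg are names I made up, and it assumes a 7.4 compiler because it uses a Dim(*auto) array.

```rpgle
**free
// Sketch only: input3, fields and msg are illustrative names.
dcl-s input3 varchar(50) inz('Field1,,"Third field",Last');
dcl-s fields varchar(15) dim(*auto : 10);
dcl-s msg varchar(50);

// The two adjacent commas are treated as a single separator.
fields = %split(input3 : ',');

// %elem reports the current number of elements in a *auto array.
// Given the behavior described above, this shows 3 rather than 4.
msg = 'Elements returned: ' + %char(%elem(fields));
dsply msg;

*inlr = *on;
```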
One way to deal with this is to modify the input prior to processing with %Split to ensure that there are never two consecutive delimiters. This code demonstrates one way to do this (an explanation follows the code):
    Dcl-S string1 VarChar(50) inz('Field1,,"Third field",Last');
    Dcl-S string2 VarChar(50);
    Dcl-S names VarChar(20) Dim(*auto: 20);
    Dcl-S name VarChar(20);

    string2 = %SCANRPL (',' : '@,@' : string1);  // Note # 1
    dsply string1;
    dsply string2;  // Displays: Field1@,@@,@"Third field"@,@Last

    names = %Split (string2 : ',');  // Note # 2
    For-each name in names;
       dsply name;  // Shows Field1@ - @@ - @"Third field"@ - @Last
    Endfor;

    names = %Trim(names : '@');  // Drop leading & trailing @ from all
    For-each name in names;
       dsply name;  // Field1 - - "Third field" - Last
    Endfor;

Note 1: We use %ScanRpl to replace all instances of a comma with the string '@,@'. The @ symbol is just a suggestion; any character that you know cannot appear in the original text will work.
Note 2: We then use %Split to break out the fields. Notice that in my example I have assigned the result of the %Split to a dynamic array for subsequent processing. I could have used a conventional array (and indeed on releases prior to 7.4 you will need to) but since the result of %Split is a temporary variable dimension array it seems like a good fit. It also simplifies the logic in that I can simply apply for-each processing to the array.
If you compile and run this code, the display makes it obvious that there are now spurious @ signs in the resulting data. Luckily, %Trim lets us strip these off easily, and it can be applied to the whole array in a single statement.
This will not provide a complete solution for every single CSV situation that you encounter. There are still occasions when you will have to resort to CPYFRMIMPF (shudder), or use an Open Access handler, or Scott Klement's CSV utilities (www.scottklement.com/csv/) for example. But in many cases this will do the job just fine.
Combining the BIFs
Just for fun I wrote a short program that combines the three BIFs I have described here, to upper-case the first letter of each part of a name. Basically it splits the full name into individual components, then converts each one into lower case before upper-casing the first letter of the result. It then strings the results together into a revised name. Here's the code:
    dcl-s inputName char(50) inz('WILLIAM james williams');
    dcl-s outputName varchar(50) inz;
    dcl-s names varchar(20) dim(*auto: 100);
    dcl-s name varchar(20);
    dcl-c blank ' ';

    names = %split( inputName );
    for-each name in names;
       outputName += %Upper( %Lower( name ) : 1 : 1 ) + blank;
    endfor;
    // outputName now contains William James Williams

As you can imagine there is a lot more you can do with these new BIFs. I'd be interested to hear what uses you have been able to find for them.
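As one starting point, the same logic could be packaged as a subprocedure so it can be reused. This is a sketch rather than a finished utility: the procedure name properCase and its variable names are my own inventions, and I have added a %TrimR to remove the trailing blank that the loop leaves behind.

```rpgle
**free
ctl-opt dftactgrp(*no);   // Needed because the module contains a subprocedure

dcl-s result varchar(50);

result = properCase('WILLIAM james williams');
dsply result;             // William James Williams

*inlr = *on;

// properCase is a hypothetical helper, not part of the language.
dcl-proc properCase;
  dcl-pi *n varchar(50);
    fullName varchar(50) const;
  end-pi;

  dcl-s part  varchar(50);
  dcl-s fixed varchar(50) inz('');

  // Split on blanks, lower-case each part, then upper-case its first letter.
  for-each part in %split(fullName);
    fixed += %upper(%lower(part) : 1 : 1) + ' ';
  endfor;

  return %trimr(fixed);   // Remove the final separator blank
end-proc;
```

Because %Split never returns empty elements, the start position of 1 and length of 1 always fall within each part.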
Final Details and RDi Notes
These BIFs are available on both the 7.3 and 7.4 releases and require PTFs for both the compiler and the run time. You can find full details of the current PTF numbers required in the RPG Cafe.
One last word for our fellow RDi devotees. In recent times we have been spoiled and RDi support for new functionality has often preceded its availability. That is sadly not the case this time around and RDi will not recognize the new BIFs until its next release.
Jon Paris is a TechChannel technical editor.