Varying-Dimension Arrays for RPG
The new IBM i 7.4 announcement brought with it varying-dimension arrays, a new data definition keyword and more
It’s Christmas in springtime for RPGers. Whenever we have a new release of RPG there is always something of a Christmas-like anticipation of just what Santa Barbara** has brought us this time. We don’t know exactly what it will be, but we’re pretty sure it will be useful and we’ll be thrilled to unwrap it. (** For those who think that “Santa Barbara” is a rather strange geographical reference, we should point out that Barbara Morris is the team lead and chief architect for the RPG compiler. Hopefully that all makes more sense now!)
With the IBM i 7.4 announcement, Santa certainly did not disappoint and we have a terrific new feature that we have wanted for a long, long time—dynamic arrays or as IBM prefers to call them “varying-dimension arrays.” In addition, we also have a new data definition keyword, SAMEPOS, which makes it easier to redefine data items without the restrictions of OVERLAY. Last but not least, there are also two new fields available in the PSDS. We’ll talk more about SAMEPOS and the PSDS enhancements later, but for now let’s focus on the enhancements to arrays.
Dynamic Arrays
Before we begin, we should point out that these array enhancements are available for 7.4 only. They will not be made available as PTFs for 7.3. So you may have to wait a while until you can use this new feature, but it will be worth the wait.
If you’re like us, whenever you set up an array you often find yourself conflicted as to whether to make the array as big as it could ever conceivably need to be, or to set a more practical limit. The first risks wasting memory and the second risks program failure. In addition, of course, we inevitably do this in the knowledge that no matter what choice we make, one day it is just not going to be big enough!
Those days are now behind us—and we can now set the maximum size of the array to a suitably massive value, secure in the knowledge that RPG will only use enough memory to hold the active content. Even better, because RPG is keeping track of the highest element in use in the array, we can use operations such as SORTA and/or %LOOKUP without having to consider using %SUBARR to restrict the operation to active elements. That alone is good reason for us to use this support.
Types of Dynamic Arrays
So how do you specify these new types of arrays? There are basically two types – *VAR (Variable) and *AUTO (Automatic). With the Automatic version RPG will always ensure that the array is large enough to hold the highest index value that you use. If the array currently has 5 active elements and you assign a value to the 25th element, RPG will automagically increase the size of the array to 25.
The actual syntax for defining these arrays is:
ArrayName DIM ( type : MaximumElements )
The MaximumElements value is the largest size that the array can ever be. Think of this as a “safety valve” to prevent a logic (or data) error from causing a program to use ridiculously high index values. So an array defined as:
Dcl-S myAutoArray Char(10) Dim ( *Auto: 1000 );
starts life with zero active elements and would grow to 900 if the following code were executed:
myAutoArray(900) = 'Highest';
Any attempt however to access an element higher than the limit of 1,000 would cause an error, just as it would with a conventional RPG array defined as Dim(1000).
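To make the "safety valve" behavior concrete, here is a minimal sketch. The variable badIndex and the messages are ours, purely for illustration, and we are assuming that an out-of-range index surfaces as an ordinary RPG exception that Monitor can trap (for conventional arrays that is status 00121). We use a catch-all On-Error rather than rely on that status code.

```rpgle
**free
// Sketch only: growth within the limit works, growth beyond it fails.
Dcl-S myAutoArray Char(10) Dim(*Auto : 1000);
Dcl-S badIndex    Int(10)  Inz(1001);   // hypothetical out-of-range index

myAutoArray(900) = 'Highest';           // array grows to 900 active elements
Dsply ('Active elements: ' + %Char(%Elem(myAutoArray)));

Monitor;
  myAutoArray(badIndex) = 'Too far';    // beyond the declared maximum
On-Error;                               // catch-all; status 00121 is our
                                        // assumption, so we do not test for it
  Dsply 'Index exceeds the maximum';
EndMon;

*InLR = *On;
Return;
```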
Contrast this with an array defined as:
Dcl-S myVarArray Char(10) Dim ( *Var: 1000 );
In this case, without other action on the programmer’s part, even an attempt to use an index value of 1 would result in an error. This is because this type of array also starts with zero active elements, but will only grow when told to do so. So how do we tell it to grow? Through use of an old friend in a new role – the %Elem() built-in. We can now use %Elem() on the left hand side of an expression to set the current number of active elements. So if we wanted to place a value in element 900, we would first have to code:
%Elem( myVarArray ) = 900;
And then we could do:
myVarArray(900) = 'Highest';
The value returned by %Elem() for both of these new types of arrays is also slightly different from conventional arrays. For these types of arrays it will return the current active size, as opposed to the maximum size. If at any time you do need to know the absolute maximum size for the array, you can obtain it by specifying the keyword *Max. For example, if you wanted to ensure that an array index for an *Auto array would not cause an error you could code:
If index <= %Elem( myAutoArray : *Max );
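The same check is arguably even more useful with a *Var array, where nothing grows unless we ask. The following is a sketch of one way to combine the two forms of %Elem(): test the index against the *Max value first, then grow the array only as far as needed. The variable index and its starting value are our own choices for illustration.

```rpgle
**free
// Sketch only: grow a *VAR array on demand, but never past its maximum.
Dcl-S myVarArray Char(10) Dim(*Var : 1000);
Dcl-S index      Int(10)  Inz(900);     // hypothetical index to be stored

If index > %Elem(myVarArray : *Max);
  Dsply 'Index exceeds array capacity';
Else;
  If index > %Elem(myVarArray);         // current active size
    %Elem(myVarArray) = index;          // grow just enough to hold it
  EndIf;
  myVarArray(index) = 'Highest';
EndIf;

Dsply ('Active elements: ' + %Char(%Elem(myVarArray)));
*InLR = *On;
Return;
```

Whether you prefer this explicit style or simply let an *Auto array grow by itself is largely a matter of how much control you want over when the resizing happens.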
*NEXT
*AUTO arrays also offer another nice feature. You don’t actually have to keep track of the highest index used. Simply specify the index value as *Next and RPG will work it out for you. So we can code something like this:
Dcl-S myAutoArray Int(5) Dim( *Auto : 999 );
Dcl-S i Int(5);   // loop counter

Dsply ('Start: Active elements = ' + %Char(%Elem(myAutoArray)) );
For i = 1 to 50;
  myAutoArray( *Next ) = i;   // appends after the last active element
EndFor;
Dsply ('End: Active elements = ' + %Char(%Elem(myAutoArray)) );
Dsply ('Maximum capacity of array is: '
       + %Char(%Elem(myAutoArray : *Max)) );
This is more in line with the way most other modern languages treat arrays and should make life a little easier for all of us, but particularly for newcomers to RPG.
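The benefit we mentioned earlier for SORTA and %LOOKUP can be sketched in the same way. In this example (the array name and values are ours) only three of the possible 1,000 elements are active, and both operations are limited to those three without any help from %SUBARR. The commented line shows roughly what the conventional equivalent would need, assuming a hand-maintained count field.

```rpgle
**free
// Sketch only: SORTA and %LOOKUP see just the active elements.
Dcl-S names Char(10) Dim(*Auto : 1000);
Dcl-S pos   Int(10);

names(*Next) = 'Charlie';
names(*Next) = 'Alice';
names(*Next) = 'Bob';

// Conventional array: SortA %SubArr(oldNames : 1 : count);
SortA names;                          // sorts the 3 active elements only

pos = %Lookup('Bob' : names);         // searches the 3 active elements only
Dsply ('Bob found at element ' + %Char(pos));

*InLR = *On;
Return;
```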
(Not Quite So) Dynamic Arrays
We mentioned earlier that there were two main types of dynamic arrays, but there is in fact a third variant which is not quite so dynamic. This option is requested via the keyword *CTDATA and as you have probably guessed it relates to compile time data arrays. Instead of specifying the number of elements in the array you simply specify *CTDATA and the compiler will set the array size based on the number of compile time data entries found. That is a nice feature but frankly not enough to overcome our dislike of compile time arrays so we’ll pass on that one.
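For those who do still use compile-time arrays, a definition might look something like the following sketch. The array name and data are ours, and we are assuming one entry per source record. Adding a fourth day name to the data would change the array size at the next compile with no change to the definition.

```rpgle
**free
// Sketch only: the compiler counts the compile-time records for us.
Dcl-S dayNames Char(9) Dim(*CtData) CtData PerRcd(1);

Dsply ('Entries found: ' + %Char(%Elem(dayNames)));

*InLR = *On;
Return;
**CTDATA dayNames
Monday
Tuesday
Wednesday
```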
Limitations and Other Considerations
This new support is not universal. That is, it is not (yet) available everywhere that you can specify a Dim(). It can currently be used only on a top-level definition, i.e., a stand-alone array or a DS array. It cannot be used on an array that is nested within a DS, for example. Nor can it be used in defining parameters. You can pass these arrays as parameters, but there are a number of considerations, and additional options, to take into account. We will be covering these in a subsequent article.
An additional limitation is that varying dimension arrays cannot be used in old-style calc specs. This isn’t much of a limitation really since anyone who hasn’t changed their coding style to free-form in the last 18 years is not likely to use anything as modern as varying dimension arrays in the first place.
This is more of a warning than a limitation really, but you need to be aware that any time you increase the current size of the array the system may need to move it in memory. As a result, if you use %Addr to obtain its address (for example to pass the array to a C-style API) you must refresh that address prior to use or risk passing the wrong data.
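In practice that means treating any saved address as stale once the array has grown. Here is a sketch of the pattern, with names of our own invention and the API call itself left out:

```rpgle
**free
// Sketch only: re-take the address after anything that may grow the array.
Dcl-S buffer  Char(10) Dim(*Auto : 1000);
Dcl-S pBuffer Pointer;

buffer(*Next) = 'First';
pBuffer = %Addr(buffer);      // address of the array as it stands now

buffer(500) = 'Later';        // growth: the system may relocate the array

pBuffer = %Addr(buffer);      // so refresh the address before using it
// ... pass pBuffer to the C-style API here ...

*InLR = *On;
Return;
```

The safest habit is probably not to store the address at all, and simply to code %Addr at the point of the call.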
Similarly, since the debugger is blissfully unaware that the array size may change, you must avoid trying to access elements beyond the current size while debugging or you’ll get confused. RPG provides the special variable _QRNU_VARDIM_ELEMS_arrayname, which contains the current number of elements and can be displayed in debug to confirm the current element count. Note, however, that this is a copy of the true variable and so changing its value during a debug session will have zero effect.
Enhancements You Can Use Now
Time to take a look at those enhancements that have also been PTF’d into 7.3. In fact they are already available – you can find the PTF details here and here.
SAMEPOS
This keyword doesn’t provide a new capability – but it does make certain types of definitions easier to implement and more obvious to those who come after. For example, back in 2003 we published an article called D-Spec Discoveries. You can read it here.
In that article we discussed techniques for redefining contiguous fields as arrays. Those techniques still work, but the new SAMEPOS keyword makes life much simpler. Assume that we have a file that contains 12 contiguous fields, one for each month of the year (JanSales, FebSales, etc.) In order to be able to treat those monthly sales figures as an array all we need to do is to code an externally described DS based on the file like so:
Dcl-Ds SalesData Ext;
MonthlySales Like(JanSales) Dim(12) SamePos(JanSales);
End-Ds;
Simple and clean. SAMEPOS identifies the starting position for the item but, unlike the POS keyword, or even the old from/to notation, it is soft-coded. As a result, if the layout of the data changes it will automagically adjust with the next compile—a much safer way to do it.
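If you want to try the idea without an external file to hand, the following self-contained sketch uses a program-described DS as a stand-in. The four quarterly fields and their values are hypothetical. The point is simply that the array and the individual fields share the same storage, so built-ins such as %XFoot can work on the whole set.

```rpgle
**free
// Sketch only: program-described stand-in for the external sales file.
Dcl-Ds SalesData;
  Q1Sales Packed(9:2) Inz(100);
  Q2Sales Packed(9:2) Inz(200);
  Q3Sales Packed(9:2) Inz(300);
  Q4Sales Packed(9:2) Inz(400);
  // The array starts wherever Q1Sales starts, even if the layout changes
  QuarterlySales Like(Q1Sales) Dim(4) SamePos(Q1Sales);
End-Ds;
Dcl-S yearTotal Packed(11:2);

yearTotal = %XFoot(QuarterlySales);   // sums Q1Sales through Q4Sales
Dsply ('Year total: ' + %Char(yearTotal));

*InLR = *On;
Return;
```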
There are many other scenarios where this feature comes in handy. One that we have encountered recently concerns finding an effective way to handle multi-format records, such as System 36 style header/details types of file. These are not as uncommon as you might think and can be found in mainframe conversions and in some types of EDI records. Here’s an example of an S/36 style file and how it can be handled without resorting to I-specs, pointers, moving data around, or any of the other techniques we used to use.
Dcl-F S36File Disk(40); // Program described input file
Dcl-Ds RecordLayout Qualified;
// Common fields
CustNo char(5);
RecordId char(1);
OrderNumber zoned(5);
// Layout for header
Dcl-Ds Header;
OrderDate date(*USA);
ItemCount int(5);
OrderTotal packed(7:2);
End-Ds;
// Layout for Detail
Dcl-Ds Detail SamePos(Header);   // <<=== starts where Header starts
ItemCode char(7);
ItemQty packed(5);
ItemPrice packed(7:3);
End-Ds;
End-Ds;
Read S36File RecordLayout; // Load record into DS
As you can see we defined the layout of the Header information as a nested DS following the common fields. We then defined Detail as a DS that starts in the same position (SAMEPOS) as the Header. If there were, for example, a Total record it would be defined in a similar manner. This is by far the cleanest way to handle these situations that we’ve seen in RPG.
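Once the record is in the DS, the processing logic just needs to test the record ID and use the matching layout. The fragment below is a sketch that relies on the S36File and RecordLayout definitions shown above. The record ID values 'H' and 'D' and the two work fields are our assumptions for illustration, and the work fields would of course sit with the other declarations.

```rpgle
// Sketch only: continues from the S36File and RecordLayout definitions.
Dcl-S orderValue Packed(9:2);    // hypothetical work field
Dcl-S lineValue  Packed(13:3);   // hypothetical work field

Read S36File RecordLayout;
DoW not %Eof(S36File);
  Select;
    When RecordLayout.RecordId = 'H';     // assumed header ID
      orderValue = RecordLayout.Header.OrderTotal;
    When RecordLayout.RecordId = 'D';     // assumed detail ID
      lineValue = RecordLayout.Detail.ItemQty
                * RecordLayout.Detail.ItemPrice;
    Other;
      // unknown record type: ignore or report as appropriate
  EndSl;
  Read S36File RecordLayout;
EndDo;
```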
There are many other uses for this new keyword. For example, you could group together separate day, month and year fields into a single date field. We’re sure you’ll find your own uses.
PSDS Additions
Two new fields have been added to the PSDS. The first is the 16-byte Internal Job Id. This can be found in positions 380 to 395. Previously, obtaining this information required calling the QUSRJOBI API, so this is a far simpler approach. The second new field comes in the category of “We can’t believe this hasn’t always been there”: it is the 8-character system name! You’ll find this in positions 396 to 403. Small enhancements, but useful nonetheless.
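Since these are simply new positions in the PSDS, picking them up is a matter of adding two subfields. A minimal sketch follows. The DS and subfield names are our own; only the positions and lengths come from the announcement.

```rpgle
**free
// Sketch only: subfield names are ours, positions are as described above.
Dcl-Ds pgmStatus PSDS Qualified;
  internalJobId Char(16) Pos(380);   // positions 380 to 395
  systemName    Char(8)  Pos(396);   // positions 396 to 403
End-Ds;

Dsply ('Running on system ' + pgmStatus.systemName);

*InLR = *On;
Return;
```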
That completes our initial review of the new RPG features. We’ll be back with a follow-on article where we’ll discuss some of the more esoteric aspects of the Dynamic Array support, and in particular how to use these new-style arrays as parameters.
Until then, if you have any comments or questions please let us know via the comments section.