An Introduction to Variable Field Lengths in RPG
Technical editors Jon Paris and Susan Gantner give a primer on this RPG feature.
By Susan Gantner
- As more and more people make use of SQL to define their tables, we see a corresponding increase in the usage of variable length fields. If those tables are to be used with native I/O operations, RPGers need to understand how to deal with them.
- We've recently seen a number of programmers moving to IBM i from other platforms, and for them, variable length fields tend to be the norm and so naturally, they use them on IBM i as well.
- Variable length fields offer many benefits to the RPG programmer as they can simplify and clarify logic, improve efficiency and just make things "better" all around.
- Last, but by no means least, when we reviewed our original article we found that it was all fixed-format RPG—including the calculations! *Shudder!*
Defining Variable Length Fields in RPGLet's begin at the beginning. Variable-length fields can be defined either internally within your RPG program, or on the database. We'll get to the database aspects later; for now we'll just look at defining and using variable length fields within RPG.
Definition within a program is a simple matter. The first example shows the old D-spec style definition where the keyword VARYING is used. In the new free-format (and all of our subsequent examples will use that format) you simply specify Varchar as the data type. These two field definitions accomplish basically the same thing:
D VaryingField1 S 200a Varying Inz Dcl-S VaryingField2 VarChar(200) Inz;
A varying-length field differs from a conventional character field in that the length you define (200 characters in the above examples) is the field's maximum size. At any given time, the current length of the field varies from a minimum of zero to the maximum of 200.
How does RPG know the current field length? Well, a variable-length field actually consists of two parts: an integer field that holds the length, followed by the actual data itself. Because of this, a variable-length field always occupies more bytes than its maximum length. For fields up to 65,535 characters in length, that integer is two bytes in length. For longer fields that length portion is four bytes.
Normally, you’ll never need to concern yourself with this. The only exception is when viewing such fields in debug. The debugger will normally just display the field like any other character field and it will appear to have trailing blanks. If you need to know if those blanks are really part of the field then you need to display the field in hex. i.e. EVAL variable name:x in the green screen debugger. You can then look at the contents of the first two (or four) bytes to determine the actual length of the field.
Note that in both examples we specified the keyword INZ to initialize the field to a zero length. This isn’t essential for stand-alone fields since their default initialization is to a zero length. However, the default initialization for all fields in a data structure is blanks—and blanks in the length portion of a variable length field can cause problems. As a result, we tend to always be explicit and specify initialization for any variable length field.
Using Variable Length Fields in RPGThe great thing about variable-length fields is that it makes it so simple to form compound strings such as when building CSVs, XML or JSON.
For example, suppose you needed to take all the elements in the array monthlySales and form them into a single comma separated string. Using a fixed length character field you might write logic like this:
For i = 1 to %Elem(monthlySales); fixedString = %TrimR(fixedString) + %Char(monthlySales(i)); If i < %Elem(monthlySales); fixedString = %TrimR(fixedString) + ','; EndIf; Endfor;
In this example, the process to add a new element requires that any trailing spaces first be trimmed from the current content before the new data is added. But of course, RPG will add those spaces back in to pad the field to its full length. For all except the final item, this trim and pad process is repeated to append a comma to each element added to the data.
This action takes place for each element added to the string. If you think that this all sounds very inefficient you are absolutely correct. It is horribly inefficient and the bigger the target string the worse it gets.
However, by using a varying length result field instead, the logic looks quite a bit simpler—and even more importantly, performs significantly better.
For i = 1 to %Elem(monthlySales); varyingString += %Char(monthlySales(i)); If i < %Elem(monthlySales); varyingString += ','; Endif; EndFor;
In this second version there’s no need to trim anything as RPG always knows where the end of the field is. So when adding data, RPG simply places it following the last character and updates the length accordingly with no blank padding.
Since we used %Char in this example, there are no trailing spaces involved. But suppose we were adding the contents of character strings that could have trailing blanks. What then? The answer is to simply trim the string as it is added. Like this:
varyingString += %TrimR(charField);
Another common use is for building up a person's name from individual first and last names. If the names are stored in conventional fixed-length fields, the code would look something like this:
fullName = %TrimR(firstName) + ' ' + lastName;
If, on the other hand, we had stored the names as variable-length fields, we could use the simpler (and certainly more intuitive) code:
fullName = firstNameV + ' ' + lastNameV;
When using fixed length character fields, we’re all accustomed to setting them to blanks when there’s no valid content. With a variable-length field, we can actually set the field to have no value. There are two ways to accomplish this in RPG.
We could set the field to a null string (represented by two consecutive single quote characters) which will result in RPG setting its active length to zero.
Alternatively, we could use RPG's %LEN built-in. Normally we think of using %Len to determine the current length of the field. For example, if we needed to know if there was enough room to fit it onto a print line. But we can also use it to directly control the length of a varying length field. As a result, the following two lines of code will achieve the same thing—setting the field myVaryingField to null.
myVaryingField = '';
%Len(myVaryingField) = 0;
Variable-Length Fields as Parameters
Last but not least, we should mention what is probably the most neglected use of varying-length fields: their ability to increase the versatility of called programs and procedures by using them as parameters even when the callers need to pass fixed length fields.
Suppose we wish to write a utility routine that accepts a character string of from 1 to 200 characters in length. The routine's callers may want to pass fixed length fields. If the parameter were to be defined as a fixed length field, then in order to process this string correctly, we would also need to require the caller to pass an additional parameter to specify the actual string length. You will probably recognize this approach as the way QCMDEXC works. (The first parameter to QCMDEXC identifies the command string and the second its length.) While we can't do anything about QCMDEXC, we can improve our own programs.
If we replace a fixed length input parameter field with a varying length one, we no longer need to require the length as a separate parameter. Instead, our called routine can simply interrogate the current length of the parameter field passed to it. We simply need to add CONST or VALUE to the parameter definition and the compiler will take care of things. In fact, RPG will set the length of the passed field to the actual length of the fixed form field that we passed in and the called routine can use %LEN to establish the size of the original parameter.
This short program demonstrates the idea. First, we call the routine with a 20-character field. Then we call it again with a 40-character field. RPG sets the length of the variable length parameter accordingly. The logic in VaryParm demonstrates this by correctly reporting that the parameter in the first call was 20 characters long and in the second call was 40 characters long.
Dcl-S firstName Char(20); Dcl-S fullName Char(40); VaryParm(firstName); VaryParm(fullName); Dcl-Proc VaryParm; Dcl-Pi *N; // Const allows fixed char or varchar inputString Varchar(80) Const; End-Pi; Dcl-S parmLength Int(5); parmLength = %Len(inputString); Dsply ('Processing a ' + %char(parmLength) + ' char parameter'); End-Proc;
Hopefully by now you appreciate that variable length fields are a valuable addition to your RPG toolkit. So what about defining them in the database?
Defining Variable-Length Fields in the Database
Variable-length fields can be defined using either DDS or DDL. Let's look at the traditional DDS method first. We defined a sample table PRODUCTLV with DDS like this using the VARLEN keyword:
DESCRIPT 32A VARLEN(32)
FULLDESCR 2000A VARLEN(200)
If you’re among the growing number of IBM i shops that define their tables with DDL then you can achieve the same by using the data type VARCHAR. The table definition ends up looking something like this:
CREATE TABLE PARTNER400.PRODUCTVL (
PRODCODE CHAR(5) NOT NULL,
DESCRIPT VARCHAR(32) ALLOCATE(32) NOT NULL.
FULLDESCR VARCHAR(2000) ALLOCATE(200) NOT NULL )
Our table contains two variable-length fields: DESCRIPT and FULLDESCR. They’re defined as 32 and 2,000 characters in length, respectively.
You may be wondering what is the purpose of the ALLOCATE(nnn) keyword in the definition. Well, varying LENGTH fields offer the possibility of reducing the disk storage required for a table in some circumstances.
Suppose that while our FULLDESCR field can be as long as 2,000 characters our typical description is likely only around 100 to 200 characters long. That means that in a database of 1,000 products, we could have wasted some 1,800,000 bytes of storage (i.e., 1,000 records which typically waste around 1,800 bytes each).
By including a length in the definition, we’ve told the database to reserve only 200 characters in the fixed portion of each record for the description. Any characters beyond that limit will be stored in a special area known as the auxiliary portion of the record. The number in parentheses following the VARLEN keyword in DDS serves the same purpose as ALLOCATE.
What happens if we don't specify an allocated length? In that case all of the data for these fields will always be placed in the auxiliary area.
If we were right about our assumption that most values in FULLDESC will contain 200 or less characters, then the descriptions for the majority of our records will use only 202 bytes (i.e., 200 allocated bytes + 2 bytes to store the actual length.) Only those that exceed the 200-character limit use additional space. All of this is transparent to the programmer.
All good, right? Well before you rush off and redefine all of your larger character fields as varying length, there are some other considerations. This ability to save disk space is not entirely without cost.
First, in order to keep track of the data in the auxiliary area, an additional 25 bytes is added to any record that includes variable length fields. As a result, you wouldn’t really save any space by defining (say) a 50 character field as varying length unless it’s rarely used.
Secondly, any record which has data stored in the auxiliary area requires a second I/O operation under the covers to retrieve that data (i.e., effectively, there will be two reads for each record). This will obviously slow the processing down.
Because of these "costs" we normally use an 85/15 rule when determining the size of the allocated area. By this we mean that we aim to ensure that 85% of all records will fit within the allocated length and only 15% will exceed it and therefore require use of auxiliary storage. It is up to you to decide on the appropriate ratio for your tables.
Hopefully we've given you some food for thought and you'll find more uses for this versatile tool. Please comment and let us know how you've put them to work in your code.
Susan Gantner's career has spanned more than 30 years in the field of application development.
Jon Paris is a TechChannel technical editor.