Of Dials and Switches: Turning Dials, Flipping Switches
It's time to put what you've learned about tunables into practice.
Previously in this article series, I identified the many kernel tunables in AIX that control the performance of CPUs, memory, storage and networking. Then I told you how to determine each tunable’s default value, along with the valid range of values that can be applied.
In summary, we learned what the tunables do. Now in part 3, we’re ready to put this learning into practice.
More Fun with Timeslice
First, a quick note of caution: The things I describe here should only be done on a test AIX system, a sandbox that management won’t miss if it crashes and burns. I’m sure you understand the perils of tinkering with a live production system, but let’s make it clear: This sort of experimentation could lead to some really bad situations, so don’t do it on a system anyone depends on.
Now let’s go turn some dials and flip some switches! In this first tuning example, we’ll call upon an old friend, the CPU scheduling option (schedo) known as timeslice. Recall that timeslice is used to essentially fatten a 10-millisecond (ms) period by increasing the number of CPU clock ticks that can run in that same 10-ms period.
In figure 1, we see the output of a schedo -FL timeslice command, issued as root. Note the name of the tunable, its current, default and boot-time values, as well as its minimum and maximum values. The minimum and maximum give the possible range of values we can apply to timeslice, from 0 up to about 2 billion. Our unit of value is “clock ticks,” and the capital D at the far right tells us the timeslice value is dynamic; it can be adjusted freely without having to reboot the system. Figure 2 shows us the output of a help command, run against timeslice.
Of course it’s a good idea to fully understand what a tunable does before fiddling with it. On that note, probably the biggest takeaway from this brief help study of timeslice is that adjusting this value will only impact fixed priority processes. Remember this when you begin measuring the effects of adjusting timeslice on your test system. (An aside: Although process studies are beyond the scope of this article series, note that you can determine if a given process has fixed priority by issuing ps -el <enter>. See the NI (nice value) column? Processes that run with a fixed priority have a double dash [--] there.)
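If you’d rather not eyeball the NI column, here is a minimal sketch of a filter that does it for you. It reads ps -el output on standard input, finds the PID and NI columns by name in the header row, and prints the PID and command of every row whose NI field is a double dash. The function name fixed_priority_procs is my own invention, and the sketch assumes the command name is the last field on each line.

```shell
# Hypothetical helper (not an AIX command): reads "ps -el" output on
# stdin and prints "PID CMD" for each process whose NI column is "--".
fixed_priority_procs() {
    awk '
        NR == 1 {
            # Locate the PID and NI columns by name in the header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID") pid = i
                if ($i == "NI")  ni  = i
            }
            next
        }
        # Assumes the command name is the last field on the line.
        ni && pid && $ni == "--" { print $pid, $NF }
    '
}

# Usage on a test system:
#   ps -el | fixed_priority_procs
```

Matching on the header names rather than a hard-coded column number keeps the sketch working if the column layout on your system differs from mine.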
Now let’s just adjust the real-time value of timeslice upwards―say, by a factor of 1,000. Use this command:
schedo -o timeslice=1000 <enter>
Use this command to confirm that the value was changed:
schedo -FL timeslice <enter>
See the CUR column? That’s your current timeslice value. Again, it should now be 1000. To get the feel of how timeslice is adjusted, repeat the schedo -o timeslice=<value> command above, incrementing the timeslice value in powers of ten: from 1,000 to 10,000 to 100,000, etc. Rather than simply recalling your previous command for this exercise, you should actually type out each iteration. The extra effort is worth it. We’re altering system behavior here, so we can never be too cautious.
You may or may not begin to see effects on your system, namely, typing response. Now go up to the maximum value; this could be 2 billion or more, depending on the clock speed of your particular POWER processor. You should eventually get a warning message telling you you’re trying to exceed timeslice’s maximum value. It’s okay. Do a schedo -FL timeslice to see where you wind up.
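When you only want the current value and not the whole -FL table, a small filter can help. The sketch below assumes that running schedo -o timeslice with no new value prints a line of the form timeslice = 1000; the helper name tunable_value is hypothetical, and it simply picks the value out of any such name = value lines on standard input.

```shell
# Hypothetical helper (not an AIX command): reads "name = value" lines
# on stdin and prints the value for the requested tunable name.
# Assumption: "schedo -o timeslice" prints a line like "timeslice = 1000".
tunable_value() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3 }'
}

# Usage on a test system, as root:
#   schedo -o timeslice | tunable_value timeslice
```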
Even on a test box, you don’t want a timeslice value to remain in the billions, so let’s dial it down. Should we reverse our command sequence, this time gradually adjusting our timeslice value downward until it’s back at the default level? Nope. Just enter the -d flag. Whether you’re working with CPU, memory, storage or networking tunables, applying -d will reset any value to its default. Here’s how it works with timeslice:
schedo -d timeslice <enter>
You should get a message that says, “Setting timeslice to 1.” A quick check with a schedo -FL timeslice will confirm that the value has indeed returned to its default. Easy. And again, -d can be applied to any tunable. The only catch is that, as it was on the way up, so shall it be on the way down. So if you had to reboot your system, run a bosboot or remount filesystems to make your initial changes stick, you’ll need to do the same when resetting that value to its default. Incidentally, you can reset whole groups of tunables―say, every schedo or vmo tunable―to their defaults by applying the -D flag, like this:
schedo -D
Depending on the subsystem you’re working with, you may see messages telling you a whole bunch of tunables have been returned to their defaults. The -D flag is most useful when you want to build an installable mksysb image, but don’t necessarily want all of your customized tunables to carry over to your new installations.
Make it Permanent
So that’s how you change the timeslice value. Note that, under these circumstances, whenever the system is rebooted, timeslice returns to its default value. Can you make your changes permanent? Of course, and naturally we have a flag for that: -p (for permanent). There are any number of reasons to make a tunable change immune to a reboot; chief among these is if you’ve been directed to change a SCHEDO, NO, IOO or VMO value by IBM Support. Most of the time, this recommendation will be made to fix an existing performance problem, and most performance problems don’t simply vanish with a reboot; they’re usually with you for the long haul. So let’s say you want to adjust your timeslice value upward to 1000, and you want that change preserved across reboots. Simply add the -p flag to your value change:
schedo -p -o timeslice=1000
You’ll get this message:
#schedo -p -o timeslice=1000
Setting timeslice to 1000 in nextboot file
Setting timeslice to 1000
OK, the change is made. But what’s this nextboot file? Time for a little reconnaissance. Go to the /etc/tunables directory and do a long listing on its contents. At minimum, you’ll see three files in this directory: lastboot, lastboot.log and nextboot. To learn why we got a reference to nextboot when we made our timeslice change permanent, view that file. In the text, you’ll see these (among other) stanzas:
schedo:
timeslice = "1000"
There’s our change. The nextboot file exists to instruct AIX to maintain the tunable values that are listed within it. So those values aren’t returned to their defaults. This is how we make tunable changes immune to a reboot. When making any permanent change, always check your work by reading the /etc/tunables/nextboot file. This is especially critical in the case of tunable changes that require a reboot to take effect. Trust me, making sure you’ve done the change correctly with this simple check will save you time in the long run.
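To make that check less error-prone, here is a rough sketch of a helper that pulls one value out of a stanza-format file like nextboot. The name nextboot_value is hypothetical, and the sketch assumes the layout shown above: a stanza header ending in a colon, followed by lines of the form name = "value".

```shell
# Hypothetical helper (not an AIX command): print the value recorded for
# a tunable in a stanza-format file such as /etc/tunables/nextboot.
# Usage: nextboot_value FILE STANZA TUNABLE
nextboot_value() {
    awk -v stanza="$2" -v name="$3" '
        # A stanza header is a single word ending in a colon, e.g. "schedo:".
        NF == 1 && $1 ~ /:$/ { current = substr($1, 1, length($1) - 1); next }
        current == stanza && $1 == name && $2 == "=" {
            value = $3
            gsub(/"/, "", value)   # strip the surrounding double quotes
            print value
        }
    ' "$1"
}

# Usage on a test system:
#   nextboot_value /etc/tunables/nextboot schedo timeslice
```

An empty result means the tunable isn’t listed in that stanza, which tells you the permanent change didn’t land where you expected.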
The other files in this directory are worth mentioning if you ever suspect a tunable was changed without your knowledge, and I’ve experienced this on numerous occasions. The lastboot file catalogs all system tunables and tells you which ones are at their default values, as of the most recent system reboot. If a value was permanently changed, lastboot will let you know. Especially in a security audit, the lastboot and lastboot.log files can be invaluable.
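For an audit, you could sketch a quick filter that lists only the entries in lastboot that aren’t at their defaults. Treat this one with care: it assumes that lastboot uses the same stanza layout as nextboot, that default entries carry a trailing comment reading # DEFAULT VALUE, and that a leading info stanza holds file metadata. Check those assumptions against the file on your own system before relying on it. The name non_default_tunables is hypothetical.

```shell
# Hypothetical helper (not an AIX command): list tunables in a
# stanza-format file that are NOT marked as defaults.
# Assumption: default entries end with the comment "# DEFAULT VALUE".
# Usage: non_default_tunables FILE
non_default_tunables() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }      # skip blank and comment lines
        NF == 1 && $1 ~ /:$/ { current = substr($1, 1, length($1) - 1); next }
        current == "info" { next }         # assumed metadata stanza
        $2 == "=" && $0 !~ /# DEFAULT VALUE/ {
            value = $3
            gsub(/"/, "", value)           # strip the surrounding double quotes
            print current ": " $1 " = " value
        }
    ' "$1"
}

# Usage on a test system:
#   non_default_tunables /etc/tunables/lastboot
```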
Additional Tunables
Some tunable changes take effect only after a system reboot, and some of those also require that a bosboot be run. Let’s start with the AIX kernel tunables that become active with only a reboot. Most of these are network (NO) tunables. Issue no -FL at a command prompt and scroll through the output. See the tunables with a capital R in the far right-hand column, with names like use_sndbufpool, inet_stack_size and netm_coalesce? They all require a reboot to go active. Simply refreshing or restarting the inetd super daemon won’t get the job done. The change is recorded in the /etc/tunables/nextboot file, and the kernel picks it up only when you reboot the system. The syntax for reboot tunables is basically the same as making a change permanent; simply substitute an -r flag for the -p flag:
#no -r -o use_sndbufpool=0
Setting use_sndbufpool to 0 in nextboot file
Warning: changes will take effect only at next reboot
Here I’ve disabled the caching of network memory buffer clusters that aid network performance. I check to make sure the change will be made active on the next system reboot by viewing my /etc/tunables/nextboot file. I’ve omitted some lines for brevity:
no:
use_sndbufpool = "0"
rfc1323 = "1"
And there’s my change to use_sndbufpool, right at the top of my network change listing.
Again, many tunable changes require a reboot and a bosboot. A bosboot, of course, creates (or recreates) a boot image with all of the necessary attributes to properly bring a system up to a running state. Tunables that require both a bosboot and a reboot have a capital B in the far right-hand column of their descriptions. This example features the memory tunable num_locks_per_semid, which increases the number of locks available per semaphore set. (You may have discussed this attribute with support in relation to backup issues.) Anyway, the syntax to adjust any tunable that requires a reboot and a bosboot is almost identical to the reboot-only syntax. There’s just one extra step:
#vmo -r -o num_locks_per_semid=2
Setting num_locks_per_semid to 2 in nextboot file
Warning: some changes will take effect only after a bosboot and a reboot
Run bosboot now? yes/no yes <enter>
bosboot: Boot image is 55324 512 byte blocks.
Warning: changes will take effect only at next reboot
Here I’ve raised the value of num_locks_per_semid, and then performed a bosboot of my system. Make sure to type in the reboot flag (-r) before the actual tuning option (-o) you want to change. Many admins transpose them or omit one or the other. I’ve certainly done this. It’s such a small thing, yet this hidden error can make your tuning efforts so frustrating.
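One way to guard against transposed flags is to let a small function assemble the command line for you and print it for review before you type or paste it. This is only a sketch; the name tunable_change_cmd is hypothetical, and it prints the command without running anything.

```shell
# Hypothetical helper (not an AIX command): print, but do not run, a
# tuning command with the persistence flag placed before -o.
# Usage: tunable_change_cmd COMMAND FLAG NAME VALUE   (FLAG: -r, -p or "")
tunable_change_cmd() {
    case "$2" in
        -r|-p) printf '%s %s -o %s=%s\n' "$1" "$2" "$3" "$4" ;;
        "")    printf '%s -o %s=%s\n' "$1" "$3" "$4" ;;
        *)     echo "unsupported flag: $2" >&2; return 1 ;;
    esac
}

# Usage:
#   tunable_change_cmd vmo -r num_locks_per_semid 2
#   tunable_change_cmd no "" tcp_fastlo 1
```

Printing instead of executing is deliberate: you still read the command before it touches the system.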
In the networking area, you’ve probably noticed that many no tunables have a capital C in the far right-hand descriptor field. C stands for connect. Attributes with this designator can be freely changed and the system need not be rebooted for the change to take effect. However, the change will only be effective for future network connections; any connections already established will not pick up the change.
To illustrate this type of change, I’ve picked tcp_fastlo. This tunable is a little-known switch that lets TCP loopback traffic (the lo0 interface) shortcut the entire TCP/IP stack to achieve better performance. To change any connection-type tunable, use the generic option change syntax:
#no -o tcp_fastlo=1
Setting tcp_fastlo to 1
Change to tunable tcp_fastlo, will only be effective for future connections
Here you see the change made and the message telling you the change will only affect future connections. To make the change permanent, add the -p flag (before the -o, of course). You’ll then get a message that the /etc/tunables/nextboot file has been updated with your change.
With storage tunables (your IOO attributes), you need to watch out for the M attribute designator. This means you must unmount and remount a filesystem before your change will take effect. In a production environment where databases reside on a filesystem, this can be problematic. Given the time and complexity of stopping and restarting your database, you may as well just reboot your system instead. I’m not sure why M even exists, since it could be replaced by R, but it is what it is, so we must deal with it.
This example features the j2_maxUsableMaxTransfer attribute. This tunable affects the number of pages that can be gathered into a single buffer structure for the logical track group (LTG). This value is sometimes adjusted to aid volume group performance. Here’s the syntax, with the resulting output:
#ioo -o j2_maxUsableMaxTransfer=1024
Setting j2_maxUsableMaxTransfer to 1024
Warning: a restricted tunable has been modified
Warning: the j2_maxUsableMaxTransfer change is only effective for future mounts
I’ve doubled the value of j2_maxUsableMaxTransfer over its default. Also notice that j2_maxUsableMaxTransfer is a restricted tunable, and an additional warning is given to that effect.
Several other tunable designators are placed in the attribute description fields strictly for informational purposes. The tunables with these designators cannot be changed; they serve primarily as reference for admins and AIX developers:
S = Static: cannot be changed
d = deprecated: deprecated and cannot be changed
There’s one last category designator: I (for incremental). As of AIX 7.2, it consists of just two tunables, both network tunables that affect streams traffic. Adjusting them will prompt no special onscreen messages; the output looks the same as with a generic -o change.
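To keep all of these designators straight, here is a minimal sketch of a lookup that turns the letter in the far right-hand column into a reminder of what the change requires. The function name tunable_type_note is hypothetical, and the wording of each note is mine, summarizing the designators covered in this article.

```shell
# Hypothetical helper (not an AIX command): map a tunable type
# designator to a short reminder of what a change requires.
tunable_type_note() {
    case "$1" in
        D) echo "dynamic: takes effect immediately, no reboot" ;;
        R) echo "reboot: takes effect at next reboot" ;;
        B) echo "bosboot: needs a bosboot and a reboot" ;;
        C) echo "connect: affects future network connections only" ;;
        M) echo "mount: affects future filesystem mounts only" ;;
        I) echo "incremental" ;;
        S) echo "static: cannot be changed" ;;
        d) echo "deprecated: cannot be changed" ;;
        *) echo "unknown designator: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   tunable_type_note B
```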
Know the Tunables, Know Your System
And that’s how you twist dials and flip switches. One final thought: Even if you never change a single kernel tunable, learning what each tunable does will prove invaluable in your understanding of how the AIX operating system functions as a whole. Being able to integrate this knowledge into your daily activities will ultimately make you a more effective and more productive administrator.