Of Dials and Switches: More about Tunables
Learn how to obtain more detailed information on each tunable.
Last month, I described how the AIX kernel, and operating system kernels in general, have evolved over the past 30 years. As advances in hardware required greater administrative control, the number of tunables and tuning scenarios spiked. I then explained how to display the CPU scheduling tunables (the schedo options) in several different formats and shared what each of the fields in the resulting output reveals.
For this, the second installment in this series, I’ll explain how to obtain more detailed information on each tunable. But before getting into that, I want to tell you about another way to display all of the AIX CPU, memory, storage and networking tunables as a group and write that information to a report.
Generating a tunables.sum File
Go to your PerfPMR installation directory. PerfPMR is the indispensable IBM tool (actually a suite of dozens of small programs) for diagnosing AIX performance issues. It should be installed on all your AIX systems. If you don’t have PerfPMR, download it from IBM’s Fix Central.
The PerfPMR installation directory houses a large script called config.sh. As the name suggests, config.sh queries your system for its configuration details and writes all of that information to a number of files. Among those files is a tunables summary called tunables.sum. This lists all of your CPU, memory, networking and storage tunables in both terse and expanded formats (both of which I covered in part 1). I’ve written extensively on the process of executing individual PerfPMR scripts, but it’s not necessary to run the full PerfPMR suite to generate tunables.sum. Simply run config.sh on its own, and you’ll have your tunables list in a snap. Note that you must be working as root. Also, PerfPMR can’t be run from its installation directory, so change to another directory to record your data. I usually create a directory in /tmp (e.g., /tmp/perfdata). From that directory, enter this at your command prompt:
perfpmr.sh -x config.sh -kaglpsmu
This command generates your tunables.sum file, and yes, there are eight letters in that second flag. Because it singles out this file, which is less than 10 percent of all the information gathered by config.sh, the run takes less than a minute. All the tunables.sum files you collect from your AIX systems should be stored in a folder separate from your big configuration binders. This will give you quick access should you ever, for instance, need to engage IBM Support about these values.
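The collection steps above can be wrapped in a small script. What follows is only a minimal sketch: the PERFPMR_DIR and ARCHIVE_DIR paths and the archive file naming are my own illustrative assumptions, and it assumes tunables.sum lands in the directory the command is run from.

```shell
#!/bin/sh
# Sketch only: collect tunables.sum and file a dated, per-host copy.
# PERFPMR_DIR and ARCHIVE_DIR are hypothetical paths; adjust for your site.
PERFPMR_DIR=${PERFPMR_DIR:-/usr/local/perfpmr}
WORK_DIR=${WORK_DIR:-/tmp/perfdata}
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/adm/tunables}

# Build the archive file name from a host name and a date stamp.
archive_name() {
    printf 'tunables.%s.%s.sum\n' "$1" "$2"
}

# Run config.sh through perfpmr.sh from WORK_DIR, then copy the summary.
collect_tunables() {
    if [ "$(id -u)" -ne 0 ]; then
        echo "must be run as root" >&2
        return 1
    fi
    mkdir -p "$WORK_DIR" "$ARCHIVE_DIR" || return 1
    (
        cd "$WORK_DIR" || exit 1
        "$PERFPMR_DIR/perfpmr.sh" -x config.sh -kaglpsmu || exit 1
        cp tunables.sum "$ARCHIVE_DIR/$(archive_name "$(hostname)" "$(date +%Y%m%d)")"
    )
}

# Do the real work only where PerfPMR is actually installed.
if [ -x "$PERFPMR_DIR/perfpmr.sh" ]; then
    collect_tunables
fi
```

Keeping one dated copy per host makes it easy to compare tunables across systems or across time before you call support.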
Digging Deeper
Now let’s return to those dials and switches. We’re all familiar with UNIX man pages. I doubt there’s an administrator out there who hasn’t spent hours poring through these online help files to learn the finer points of a command. The man pages for schedo (and for vmo for memory, ioo for storage, no for networking, nfso for NFS and raso for reliability, availability and serviceability) have high-level information about manipulating each subsystem’s tunables. However, they don’t really tell you what individual tunables do, only that they can be adjusted. Fortunately, there’s more than meets the eye here. Unlike most other AIX command man pages, those for the kernel tunables are nested: each tunable has its own help text. So if you want help on any particular tunable, use the -h flag followed by the tunable’s name. Take timeslice as an example. Timeslice is the CPU tunable that lets you adjust the number of clock ticks a thread can run on a CPU. Enter this at a command prompt:
schedo -h timeslice
You’ll get this output:
lpar# schedo -h timeslice
Help for tunable timeslice:
Purpose:
The number of clock ticks a thread can run before it is put back on the run queue.
Values:
Default: 1
Range: 0 - 2147483647
Type: Dynamic
Unit: clock ticks
Tuning:
Increasing this value can reduce overhead of dispatching threads.
The value refers to the total number of clock ticks in a timeslice
and only affects fixed-priority processes.
Not bad. This gives us a decent idea of how to adjust the timeslice parameter as well as the possible values we can apply to it, plus we can see why we might adjust timeslice to begin with. The -h flag allows you to access this contextual help with any AIX tunable. Try it for yourself with any of the schedo, vmo, ioo or no tunables. I’ve printed this instructional help on every AIX tunable and keep it in a binder. As the years go by and new AIX versions are released, tunables are invariably added to and dropped from my list. Believe me: The time I’ve spent logging this invaluable information is well worth the effort. I encourage you to start and maintain your own volume of AIX tunables.
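If you’d rather not build that volume by hand, a loop can collect the help text for you. Treat this as a sketch under stated assumptions: it assumes the -F and -x flags together print one comma-separated line per tunable with the name in the first field, and OUT_DIR is a hypothetical location I picked for illustration.

```shell
#!/bin/sh
# Sketch only: save the -h help text for every tunable of each command.
# Assumption: "<cmd> -F -x" prints one comma-separated line per tunable,
# with the tunable name in the first field.
OUT_DIR=${OUT_DIR:-/tmp/tunable_help}   # hypothetical output directory

# Read -x style lines on stdin and print only the tunable names.
# Lines without a comma (such as a section marker) are skipped.
tunable_names() {
    awk -F, 'NF > 1 && $1 !~ /^#/ { print $1 }'
}

# Write the help for every tunable of one command to <OUT_DIR>/<cmd>.help.txt.
dump_help() {
    cmd=$1
    mkdir -p "$OUT_DIR" || return 1
    "$cmd" -F -x | tunable_names | while read -r name; do
        "$cmd" -h "$name"
        echo
    done > "$OUT_DIR/$cmd.help.txt"
}

# Only attempt this on AIX, where the tuning commands exist.
if [ "$(uname -s)" = "AIX" ]; then
    for cmd in schedo vmo ioo no; do
        dump_help "$cmd"
    done
fi
```

Rerunning this after each AIX upgrade and comparing the new files with the old ones is one way to spot tunables that have been added or dropped.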
Unrestricted vs. Restricted: The Dividing Line
One final note on tunables: Chances are you’ve run schedo -FL, or you’ve applied the -FL flags to the vmo, ioo or no commands, and noticed a clear division between the first and second halves of the tunable list. This boundary marker is the same regardless of the subsystem you’re querying. It looks like this:
##Restricted tunables
What’s this about? Simply this: the tunables listed after this declaration are a class unto themselves. They were put there by IBM’s AIX developers. The people who actually write the operating system code feel it’s important to establish a clean division between these restricted tunables and the unrestricted tunables of any subsystem.
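Because the marker is a fixed line of text, it’s easy to split a listing into its two halves. Here’s a minimal sketch; the function names are my own, and it assumes the marker starts at the beginning of a line exactly as shown above.

```shell
#!/bin/sh
# Sketch only: split a tunable listing at the "##Restricted tunables" marker.

# Print the lines that come before the marker.
unrestricted_section() {
    awk '/^##Restricted tunables/ { exit } { print }'
}

# Print the lines that come after the marker.
restricted_section() {
    awk 'seen { print } /^##Restricted tunables/ { seen = 1 }'
}

# On AIX, show only the restricted half of the schedo listing.
if [ "$(uname -s)" = "AIX" ]; then
    schedo -FL | restricted_section
fi
```

The same two filters should work on the vmo, ioo and no listings, since the marker is identical for every subsystem.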
Generally, unrestricted tunables can be altered freely. If you open a support call, you shouldn’t get much flak for changing the value of any unrestricted tunable. Changing the value of a restricted tunable is another matter entirely. IBM could go as far as to withhold support until the restricted values are returned to their defaults. At the very least, support will encourage you to return the restricted tunable(s) to their default values. Then you’ll be told that the restricted tunables exist solely for AIX development to tweak, and that these tweaks occur only in extraordinary cases where there’s a clear need to do so.
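Before you open that support call, it’s worth knowing whether any restricted tunable has drifted from its default. The sketch below is read-only and rests on two assumptions I haven’t confirmed for every AIX level: that the -F and -x flags together print name, current value and default value as the first three comma-separated fields, and that the restricted marker appears in that output as well.

```shell
#!/bin/sh
# Sketch only: list restricted tunables whose current value differs from
# the default. Assumptions: "<cmd> -F -x" prints name,current,default,...
# one tunable per line, with the "##Restricted tunables" marker before
# the restricted ones.

changed_restricted() {
    awk -F, '
        /^##Restricted tunables/ { restricted = 1; next }
        restricted && NF >= 3 && $2 != $3 {
            printf "%s current=%s default=%s\n", $1, $2, $3
        }'
}

# Only attempt this on AIX, where the tuning commands exist.
if [ "$(uname -s)" = "AIX" ]; then
    for cmd in schedo vmo ioo no; do
        "$cmd" -F -x | changed_restricted
    done
fi
```

An empty report means every restricted tunable is at its default, which is where support will want them.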
It may not seem like a big deal, but it is. Consider this scenario with our old unrestricted schedo pal timeslice and a restricted tunable, smt_snooze_delay. As noted, timeslice is a benign CPU tuning attribute. It’s measured in clock ticks, and on AIX a clock tick is 10 milliseconds (ms), so the default timeslice of one tick is 10 ms, a value that can be adjusted up or down. The possible range of timeslice values is huge: anywhere from zero to around 2 billion. You can tinker with timeslice to your heart’s content (in a test environment, of course) and you probably won’t cause your system any real harm.
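To put numbers on that, here’s a small conversion sketch. It assumes 10 ms per clock tick as described above, and it doesn’t model any special meaning a value of zero might have.

```shell
#!/bin/sh
# Sketch only: convert a timeslice value in clock ticks to milliseconds.
# Assumption: one clock tick is 10 ms.
TICK_MS=10

timeslice_ms() {
    echo $(( $1 * TICK_MS ))
}

timeslice_ms 1    # the default of one tick: prints 10
timeslice_ms 5    # a hypothetical larger setting: prints 50
```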
But now let’s talk about our restricted tunable. The purpose of smt_snooze_delay is to adjust the amount of time a CPU spends idling without any useful work to do before it essentially goes to sleep and calls a hypervisor routine known as h_cede. This routine tells an idle virtual CPU to enter a wait state and give (or cede) its capacity to another virtual CPU. I cover h_cede in a separate article on the Power hypervisor and kernel tracing.
This snooze time can be adjusted anywhere up to 100 million microseconds. Nothing dangerous, so far. However, one particular setting of smt_snooze_delay effectively disables snoozing, or the capability of a CPU to idle. It forces the CPUs in your system to work constantly. This sounds great until you realize that there isn’t a storage subsystem or network adapter on the face of the planet that can keep up with a set of IBM Power Systems CPUs that have, for all intents and purposes, gone crazy. Disable snoozing and you’ll flood storage and networking with work they can’t possibly process. Your system will likely seize up like an engine without oil. In short, not a good career move.
The smt_snooze_delay restricted tunable exists so that the IBMers who write AIX can test things like timing and interrupt behavior. It has to be included in AIX and it can’t be made invisible. So, seriously, leave it alone. I say this with all earnestness: Under no circumstances should you ever disable snoozing in any of your systems. EVER. I hope I have your attention.
This is just one example. There are many more restricted tunables for CPUs, memory, networking and storage. Changing any of them could have disastrous consequences for your system. All that said, it is worth your while to learn about these restricted dials and switches. Indeed, an in-depth knowledge of what they do under different circumstances will aid your understanding of how AIX works as a whole.
Study the individual help paragraphs on every tunable with the -h flag. Memorize them if you can. You’ll see how different workloads react to changes in tunable values. (And again, tinkering even with unrestricted tunables should only occur in a test environment.) Then place a Sev 4 (request for information) call to support and ask someone in Austin about any given tunable. If you reach a rep who has some time, you should get a thorough explanation. Prior to joining IBM, I spent years doing this sort of thing. It’s also a good way to build a relationship with support.
Putting It Into Practice
We’ve covered how to list all of the available tunables in your system, as well as how to get contextual help on what each does. We’ve noted the different classes of tunables―restricted and unrestricted―and made a case for or against their tuning. In the third installment in this series, we’ll look at how to adjust any tunable under different conditions, and go through some examples.