AIO: The Fast Path to Great Performance
Asynchronous input and output (AIO) is an essential performance feature of AIX. Without it, our world would be a much slower place.
Most AIX administrators are familiar with the concept of asynchronous input and output (AIO). AIO is coded into many operating systems, including most UNIX and Windows versions. You don’t hear much about AIO, I guess because it’s been around for so long it’s somewhat taken for granted. It lacks the appeal of a flashy GUI, blazing response times or massive amounts of memory. Rest assured though, without AIO, our world would be a much slower place.
AIO is an essential performance feature of AIX; it’s been incorporated into the operating system for decades and is one of the first tunables newbie administrators learn to adjust. But to really see how AIO works in AIX and why it’s so important, we need to understand some concepts and terminology.
The Basics
Essentially, AIO is input and output (read and write) processing that allows other processing to begin before the first I/O has completed.
Say you have a database that has created two threads, each of which initiates some form of I/O activity in your system. In a system without AIO, the first thread would carry out its I/O while the second thread would be forced to wait in a queue. Only when the first thread’s I/O was complete could the second thread’s I/O run.
This is an example of synchronous I/O―though programmers typically refer to it as “blocking I/O.” Back in the dark ages of computing systems before multitasking became an essential programming feature, this was how I/O was handled, and the systems that utilized it were s-l-o-w. Not only did I/O processing occur at a glacial pace, the net result was that most system resources―disk, CPU, memory and network―were idle the majority of the time. Think about it: If system function must be held up while your database does a simple read or write, and you’re doing thousands of those reads and writes, most of your time is going to be spent transitioning and readying the system to do the next I/O (and the next…). At a time when memory and CPUs were very expensive items, this I/O scheme did not produce the greatest bang for the IT investment dollar.
Fortunately, somewhere along the way, one or more truly brilliant programmers came up with code that allowed that second database thread to initiate its own I/O operation before the first thread had completed its I/O. And as the concept and the code behind it evolved, it became possible for computing systems to carry out hundreds and even thousands of I/O operations simultaneously without holding up other processing. Broadly speaking, this is “asynchronous” operation, and when applied to storage (and networking, incidentally), we call it AIO.
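To make this concrete, here’s a minimal sketch of what asynchronous I/O looks like at the code level, using the POSIX-style aio_read() interface found on most UNIX systems. (AIX’s legacy and POSIX AIO variants differ slightly in headers and compile details, and the file name datafile is just a placeholder.)

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    struct aiocb cb;

    int fd = open("datafile", O_RDONLY);   /* placeholder file name */
    if (fd < 0) { perror("open"); return 1; }

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    /* Queue the read and return immediately; this thread is free
       to do other useful work while the I/O is in flight. */
    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* ...other work happens here... */

    /* Wait for completion; a real program might use aio_suspend()
       or a completion signal instead of polling. */
    while (aio_error(&cb) == EINPROGRESS)
        ;

    printf("read %ld bytes asynchronously\n", (long)aio_return(&cb));
    close(fd);
    return 0;
}

Compare that with a plain blocking read(): the call wouldn’t return until the data arrived, and the thread would sit idle the whole time.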
An aside: AIX is a bit confusing with its storage acronyms. We have AIO, but we also have CIO and DIO. We even have an error condition called EIO. Understandably, this can confuse beginning admins. Ultimately, though, it’s simple: AIO is an I/O method, while DIO (direct I/O) and CIO (concurrent I/O) are ways to mount filesystems. EIO, as I said, is an error.
Also, before we go any further, I need to talk about the elephant in the room: Unless your application code specifically takes advantage of AIO, you won’t be able to use this facility. These days, most major database vendors’ products utilize AIO, including Oracle, DB2 and Cache. But with applications―particularly homegrown applications and those produced by small vendors―it’s pretty much a crapshoot. To determine whether an app uses AIO, you take an AIX kernel trace and examine the data. That isn’t trivial, but it’s the only way to tell on your own when you have both a database and your app installed in the same system. Suffice it to say that, when in doubt, it’s best to contact your application vendor and talk directly to the developers. First-line support almost never has this information.
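If you want to take a rough first look yourself, one crude approach (nowhere near a full trace methodology) is to capture a short trace while the application is busy and grep the formatted report for AIO activity. The file name and the 30-second window here are arbitrary:

trace -a -o /tmp/aio_check.trc    # start tracing in the background
sleep 30                          # let the workload run
trcstop                           # stop the trace
trcrpt -O exec=on,pid=on /tmp/aio_check.trc | grep -i aio

Treat this as a quick sniff test only; interpreting AIO trace data properly takes considerably more care than a grep.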
Fastpath and Slowpath AIO
So how is AIO implemented in AIX? There are two implementation methods: fastpath and slowpath. In fastpath AIO, asynchronous I/O is handled through adapter driver software and/or Logical Volume Manager (LVM) code. In slowpath AIO, a kernel thread that’s inserted into the I/O stream carries out the operation. Hence the names: if your I/O request has to make an additional hop to get from point A to point B, it’s going to be slower than the more direct route.
Fastpath and slowpath AIO can be enabled in your AIX system simultaneously. Which method you’ll use depends on your application or database programming code. In early AIX versions, AIO fastpath had to be specifically enabled, but these days it’s on by default. And every version of AIX as far back as I can remember always configured some number of AIO kernel threads to handle slowpath AIO, and this hasn’t changed much in more than two decades. The upshot is you’ll be covered either way.
Of course, nothing in UNIX is ever as straightforward as it seems. When we utilize kernel threads to handle AIO in AIX, we need to remember that there are two types, and we must also know which type our application or database uses. The AIO kernel thread types are “legacy” and “POSIX.” While they’re the same from a functional standpoint, they’re implemented into code in slightly different ways. (For our purposes these differences are negligible and we needn’t spend time on them. But if you really want to know, it’s spelled out online in the IBM Knowledge Center. Don’t say I didn’t warn you: this material is migraine-inducing.)
Anyway, AIX provides the option to alter the quantity of each type of kernel thread. Each consumes some memory for its process environment―about 440K―so if you’re on a low-memory system, I wouldn’t configure thousands of them. So how many of these AIO kernel threads―commonly called AIO servers―will you need? As root at an AIX command prompt, have a look at the input and output options (IOO) with this command:
ioo -FL | more
You’ll get this output:
NAME                      CUR    DEF    BOOT   MIN    MAX     UNIT      TYPE
     DEPENDENCIES
--------------------------------------------------------------------------------
aio_active                0      0                            boolean   S
--------------------------------------------------------------------------------
aio_maxreqs               128K   128K   128K   4K     1M      numeric   D
--------------------------------------------------------------------------------
aio_maxservers            30     30     30     1      20000   numeric   D
     aio_minservers
--------------------------------------------------------------------------------
aio_minservers            3      3      3      0      20000   numeric   D
     aio_maxservers
--------------------------------------------------------------------------------
aio_server_inactivity     300    300    300    1      86400   seconds   D
These fields tell you the status of legacy AIO in your system. (The POSIX AIO tunables come later in the display.) Top to bottom, here’s what these fields are telling you:
- aio_active―This shows whether the AIO kernel extension has actually been used in your system and pinned (i.e., made non-pageable in memory).
- aio_maxreqs―This is the maximum number of AIO requests that can be in progress at one time, counting both those actively being serviced and those waiting in queues. While imprecise, it’s easiest to think of this value as the total number of AIO requests the system will accept at once.
- aio_maxservers―This is the maximum number of legacy AIO kernel threads, or servers, that can be created for slowpath AIO, on a per-CPU basis. The default is 30 such servers per CPU.
An interesting bit of trivia: In very early AIX versions, this value was initially calculated on a per-CPU basis. Along about the late 1990s, as I recall, the philosophy shifted and you allocated your AIO servers according to the number of disk drives you had. The rule was ten times the number of drives that were accessed asynchronously. After another few years, we went back to the per-CPU value.
If you’ve read my articles on CPU usage, you can see that determining the number of AIO servers based on your CPU count can be dicey. Are we talking about entitled or virtual CPUs? What if your virtual-to-entitled CPU ratio is above the performance best practice of 1:1? What if it’s 2:1 or 3:1? What if it’s the 20:1 ratio that’s allowed in POWER8 systems? And are those CPUs operating in raw or scaled throughput modes, meaning should you calculate your AIO server needs based on one hardware thread’s utilization, or all of them? The answers to these questions are by no means as cut and dried as the man pages make out. My advice? If you have the available memory, use the 10X hdisk value to determine a baseline for the number of AIO servers. With most heavily used databases, regardless of the vendor, you’ll probably find the default value to be inadequate. If you’re planning for AIO in one of these environments, start with an aio_maxservers value of 10X the number of disks that will be involved in AIO and adjust upwards from there, if needed. I’ll put some rough numbers on this sizing approach right after this list.
- aio_minservers―This is the minimum number of AIO servers the system will keep active once they’ve been started. My best advice is to never alter the default value of 3.
- aio_server_inactivity―After 300 seconds, if an AIO server has found no useful work to do, it exits and disappears from the process table. This is another value you should probably leave be.
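As promised, here’s a back-of-the-envelope sketch of that sizing math. The LPAR size and disk count are hypothetical, so substitute your own:

# Hypothetical LPAR: 40 logical CPUs, 200 hdisks accessed asynchronously
# 10X rule of thumb:  10 x 200        = 2000 AIO servers total
# Per-CPU tunable:    2000 / 40 lcpu  = 50 (versus the default of 30)
# Worst-case memory:  2000 x ~440K    = roughly 860MB if they all start

ioo -p -o aio_maxservers=50    # -p applies the change now and at each reboot

Treat those numbers as a starting point only; watch your actual AIO server activity (more on how to do that below) before pushing the value higher.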
The POSIX AIO server tunables appear a few lines down in your ioo -FL output and mirror the legacy tunables. Once you’ve determined whether you’re doing legacy or POSIX AIO, tune these values the same way.
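If you want to jump straight to the POSIX set, a quick filter on the tunable names works; on recent AIX levels you should see counterparts such as posix_aio_maxservers, posix_aio_minservers and posix_aio_maxreqs:

ioo -a | grep posix_aio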
The legacy and POSIX AIO tunables I’ve just covered are non-restricted tunables, meaning that they can be freely changed by a site administrator. Some restricted tunables also bear mention, however, as their proper usage clears up a few common technical misconceptions. Most of these restricted tunables govern the use of AIO fastpath. Keep in mind that if your database or application uses slowpath, adjusting these tunables yields no benefit; in fact, you could harm your system. So let’s look at these tunables as information-only dials and switches.
Again, here is part of our ioo -FL output:
##Restricted tunables
--------------------------------------------------------------------------------
aio_fastpath              1      1      1      0      1       boolean   D
--------------------------------------------------------------------------------
aio_fsfastpath            1      1      1      0      1       boolean   D
--------------------------------------------------------------------------------
aio_kprocprio             39     39     39     0      254     numeric   D
--------------------------------------------------------------------------------
aio_multitidsusp          1      1      1      0      1       boolean   D
--------------------------------------------------------------------------------
aio_sample_rate           5      5      5      1      86400   seconds   D
--------------------------------------------------------------------------------
aio_samples_per_cycle     6      6      6      1      128K    numeric   D
From the top:
- aio_fastpath―This controls how AIO requests to raw devices are routed. The default for this tunable is 1, meaning fastpath is enabled: AIO requests are sent directly to the LVM or disk code, bypassing the Virtual Memory Manager (VMM) and filesystem, so no caching is done at that level. Turning off fastpath routes AIO requests to the kernel processes instead. However, if AIO is running to a filesystem, the fastpath state doesn’t matter because calls will go to the VMM. AIO to a raw logical volume (LV) with fastpath disabled results in the raw LV being treated as a special file, passing its requests through the VMM and filesystem code.
- aio_fsfastpath―When mounting your JFS2 filesystems with the CIO―or “concurrent”―option, this setting determines how I/O requests are made. With the default value of 1, the I/O request is routed directly to the LVM or disk. Setting this tunable to 0 disables the filesystem fastpath and routes I/O requests to the kernel processes.
- aio_kprocprio―This controls the priority of AIO kernel processes. Essentially, it controls their “nice” value.
- aio_multitidsusp―This determines whether threads can suspend execution on an AIO request by another thread. This value only applies to legacy AIO servers.
- aio_sample_rate and aio_samples_per_cycle―These days, AIO servers are dispatched from pools of such servers. These two tunables govern how a mechanism called “decay” is managed in these pools. Pretend these last two options don’t exist.
And like the unrestricted tunables, these values are duplicated for the POSIX AIO system. My best advice is to tune the restricted settings only after extensive consultation with your database or application vendor. As a general rule, you shouldn’t need to touch them.
So What AIO Type Are You Using?
Now that we’ve defined AIO and explained the types of AIO that are available in AIX, you might be thinking, “In my existing system, how can I tell what kind of AIO I’m using, if any?” In a system with a single purpose―i.e., a database server or one that runs a single application―the answer is easy: use pstat. You could use a regular ps command with the appropriate flags (-elfk, for example), but with pstat, you only need to remember the -a flag.
Let’s look at two different outputs from a pstat -a, grepping for aioserver (kernel process) activity:
pstat -a | grep aio
53 a 35007c 1 35007c 0 0 1 aioLpool
55 a 37009c 1 37009c 0 0 1 aioPpool
Here we see only two entries: the pools from which AIO servers are dispatched. We can infer from this output that we’re either using AIO fastpath―in which case we don’t use AIO servers―or we’re not doing any form of AIO.
Now look at the same command, run on a different system:
pstat -a | grep aio
1088 a 4001ac 1 4001ac 0 0 1 aioPpool
1133 a 6d01f2 1 6d01f2 0 0 1 aioLpool
1286 a 1060150 1 1060150 204 204 1 posix_aioserver
1357 a 14d01aa 1 14d01aa 204 204 1 posix_aioserver
1480 a 1c8010c 1 1c8010c 0 0 1 posix_aioserver
1497 a 1d901fc 1 1d901fc 0 0 1 posix_aioserver
…lines omitted …
In this case we see not only our AIO pools but also the AIO servers that have been dispatched from those pools. If we have active AIO servers, we’re using AIO slowpath for at least some portion of our total workload.
Things get a little more complicated if we’re running multiple applications in one system. Again, let’s take the example of a system with both a database and an app of some sort. It’s a good bet the database is doing AIO, but what about the app? Depending on how granular you want to get, you have two choices: an AIX kernel trace, or iostat. I usually use kernel traces to tell me which entity is doing what kind of AIO, but there’s a middle road: Use iostat with the -A and -P flags. This is far easier to implement and a whole lot easier to read. The -A flag will tell us about legacy AIO server activity while -P tells us about POSIX AIO server activity. Other fields in the output will show us AIO fastpath activity, if any:
iostat -A | more

System configuration: lcpu=40 drives=834 ent=10.00 paths=2 vdisks=0 maxserver=1200

aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
iostat: 0551-157 Asynchronous I/O not configured on the system.
Here we’re looking for legacy AIO activity. We get a message that says, “Asynchronous I/O not configured on the system.” So we know that if we’re doing slowpath, we’re not using legacy kernel processes. Now we issue the same command with the -P flag:
iostat -P | more
System configuration: lcpu=40 drives=834 ent=10.00 paths=2 vdisks=0 maxserver=1200
aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
485.5 0.0 79 0 131072 61.7 8.8 27.2 2.3 5.6 111.2
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk17 0.0 2.3 0.3 3010263 401879
hdisk16 0.0 2.1 0.3 2544751 574347
hdisk15 0.0 2.2 0.3 2838471 454843
hdisk14 0.0 2.3 0.3 2904307 489979
Aha! This is different. In this output, we see counters in a few fields plus drive activity. It’s the drive activity that tells us POSIX AIO is active in this system. So remember, where you see disk activity, that’s the AIO slowpath type you’re using―legacy or POSIX. What about fastpath AIO? Look to the first few lines of your iostat output. See the line that begins with aio:? That’s where you can see if you’re using fastpath. Here are the pertinent fields:
- avfc―Displays the average fastpath request count per second.
- maxfc―Shows the maximum fastpath request count since the value was last fetched. Positive values in the avfc and maxfc fields mean you’re using fastpath AIO.
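One usage note: rather than relying on a single snapshot, you can give iostat an interval and a count and watch these counters move while your workload runs. The five-second interval and 12 samples here are arbitrary:

iostat -A 5 12    # legacy AIO servers plus the fastpath counters
iostat -P 5 12    # the same view for POSIX AIO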
You’ll generally see fastpath AIO used if you have AIO going to filesystems that are mounted CIO or DIO (the former is an extension of the latter) and to raw logical volumes. Slowpath AIO will happen in filesystems that are mounted normally and use the filesystem cache. Beyond this, to pin down the exact type of AIO used in systems with mixed database and application environments, you’ll need to run an AIX kernel trace. Kernel tracing for AIO is a topic that requires an article of its own, and in the coming months, we’ll take this next and far more advanced step.
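Since the mount options largely determine which path you’ll get, a quick sanity check is to scan the options column of the mount output for the cio and dio keywords; the egrep pattern here is just a convenience:

mount | egrep "cio|dio"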
In the meantime, I’ll address the second elephant in the room: I/O completion ports (IOCP).
A Quick Look at IOCPs
Ever wonder about these things? IOCPs have shipped with AIX distributions since at least v4.1. To me, they’re essential. If you run lsdev -Cc iocp, it’ll return this line: iocp0 Defined I/O Completion Ports. By default, iocp0 sits in the Defined state, so you’ll need to set it to Available; I’ll show the commands at the end of this section. But how do IOCPs relate to AIO?
An IOCP is an API that carries out multiple simultaneous I/O operations in many different operating systems. It’s created and associated with file handles and sockets. (Remember when I said AIO worked for both storage and networking?) Without getting too weedy, an IOCP is a structure that is notified every time an I/O operation completes. What happens is a process that requests I/O checks the completion port for a message that says (essentially), “You’re done, go ahead and do some more I/O.” The completion ports and their messaging scheme impose an order―and speed―on AIO that you wouldn’t have without them. Strictly speaking, you don’t have to configure IOCP in systems where you will do AIO. But this messaging facility increases the performance of your AIO operations, so I always opt to flip the IOCP to the available state. As noted, there are AIO fastpath instances where IOCPs don’t help. Raw data, after all, isn’t associated with file handles.
Did you get that? Yes, I’m saying that raw data that uses fastpath AIO might not benefit from IOCPs. But what if that raw data is then written to a socket? Short answer: Always configure your IOCPs in an available state.
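As promised, here’s the sequence typically used to flip iocp0 to the Available state (you can also do the same thing through smitty iocp). The chdev makes the setting stick across reboots, and the mkdev activates it immediately:

chdev -l iocp0 -P -a autoconfig=available
mkdev -l iocp0
lsdev -Cc iocp    # should now show iocp0 as Available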
Why AIO Matters
While we’ve covered a lot of ground, there’s still plenty more to say about AIO. For now though, I’ll leave you with this thought: Without an understanding of the AIO mechanism, getting the best possible storage―or network―performance from your AIX systems is impossible. If you never study another I/O topic, make sure you study this one.