New lsmpio Command Provides Better View of MPIO

The lsmpio command and the -U flag for the chdev command make AIX highly available and capable of undergoing dynamic system changes.

A recent IBM developerWorks article, IBM AIX MPIO: Best practices and considerations, discussed ways of ensuring you have an efficient MPIO configuration on your AIX systems. I highly recommend you take the time to read this article in full. It also introduced some new features to the AIX operating system that I thought were worth exploring and discussing further.

The authors introduced us to a new command (in AIX 7.1 TL3 and 6.1 TL9), called lsmpio. This command displays information about the MPIO storage devices on AIX. The default output provides a very similar view of your MPIO configuration to that produced by the (existing) lspath command. When I ran the lsmpio (and lspath) commands on my system, I saw the following output:

# oslevel -s
7100-03-01-1341

# lsmpio
name    path_id  status   path_status  parent  connection

hdisk0  0        Enabled  Clo          vscsi0  810000000000
hdisk1  0        Enabled  Sel          vscsi0  820000000000

# lspath
Enabled hdisk0 vscsi0
Enabled hdisk1 vscsi0

Immediately I saw that lsmpio provides a lot more information than lspath. I now have access to extended status information, such as whether a path is closed (Clo) or currently selected for I/O operations (Sel). The possible values for the extended status field (path_status) are:

  • Opt - Indicates that the path is an optimized path. This value indicates a path that attaches to a preferred controller in a device that has multiple controllers. The PCM selects one of the preferred paths for I/O operations, whenever possible.
  • Non - Indicates that the path is a non-optimized path. On a device with preferred paths, this path is not considered as a preferred path. The PCM avoids the selection of this path for I/O operations, unless all preferred paths fail.
  • Act - Indicates that the path is an active path on a device that has active and passive controllers. The PCM selects active paths for I/O operations on such a device.
  • Pas - Indicates that the path is a passive path on a device that has active and passive controllers. The PCM avoids the selection of passive paths.
  • Sel - Indicates that the path is being selected for I/O operations at the time the lsmpio command is run.
  • Rsv - Indicates that the path has experienced an unexpected reservation conflict. This value might indicate a usage or configuration error, with multiple hosts accessing the same disk.
  • Fai - Indicates that the path experienced a failure. It’s possible for a path to have a Path Status value of Enabled and still have an Extended Status value of Fai. This scenario indicates that operations sent on this path are failing, but AIX MPIO has not marked the path as Failed. In some cases, AIX MPIO leaves one path to the device in Enabled state, even when all paths are experiencing errors.
  • Deg - Indicates that the path is in a degraded state. This scenario indicates that the path was being used for I/O operations. Those operations experienced errors, thus causing the PCM to temporarily avoid the use of the path. Any additional errors might cause the path to fail.
  • Clo - Indicates that the path is closed. If all paths to a device are closed, the device is considered to be closed. If only some paths are closed, then those paths might have experienced errors during the last time the device was opened. The AIX MPIO periodically attempts to recover closed paths, until the device path is open.
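
The status values above lend themselves to a quick filter. The following is a minimal sketch, not an official tool: flag_suspect_paths is a hypothetical helper name, and it assumes the six-column default lsmpio layout shown earlier, with path_status possibly holding several comma-separated values.

```shell
# Print every path whose extended status suggests trouble.
# Reads default `lsmpio` output on stdin (assumed layout:
# name path_id status path_status parent connection).
flag_suspect_paths() {
    awk '
        # Skip the header row and any blank or short lines.
        $1 == "name" || NF < 6 { next }
        # Column 4 is path_status; Fai, Deg and Rsv deserve a closer look.
        $4 ~ /(^|,)(Fai|Deg|Rsv)(,|$)/ {
            printf "%s path %s via %s: %s (%s)\n", $1, $2, $5, $4, $3
        }
    '
}

# On an AIX host with lsmpio available:
#   lsmpio | flag_suspect_paths
```

No output means that no path reported Fai, Deg or Rsv when the command was run.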

The command has several useful options. For instance, the -S flag displays statistics and counters for hdisk devices, so you can quickly determine whether any errors have been recorded for a device.

# lsmpio -l hdisk1 -S
Disk: hdisk1
    Path statistics since Wed Jan 15 10:52:33  2014
    Path 0: (vscsi0:820000000000)
        Path Selections:                               490996
        Adapter Errors:                                     0
        Command Timeouts:                                   0
        Reservation Conflicts:                              0
        SCSI Queue Full:                                    0
        SCSI Busy:                                          0
        SCSI ACA Active:                                    0
        SCSI Task Aborted:                                  0
        SCSI Aborted Command:                               0
        SCSI Check Condition:                               0
        Last Error:                                       N/A
        Last Error Time:                                  N/A
        Path Failure Count:                                 0
        Last Path Failure:                                N/A
        Last Path Failure Time:                           N/A

Note: The lsmpio command works with AIX MPIO storage devices only. I encourage you to read the documentation for this command on the AIX Information Center to learn more.
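
With many disks, scanning the lsmpio -S counters by eye gets tedious. The sketch below prints only the counters that are greater than zero. nonzero_path_errors is a hypothetical name, and the parsing assumes the "label: value" layout shown in the hdisk1 listing.

```shell
# Report any non-zero error counter found in `lsmpio -S` output on stdin.
# "Path Selections" is skipped because it counts usage, not errors.
nonzero_path_errors() {
    awk -F: '
        # Remember which disk and path the following counters belong to.
        /^Disk:/          { disk = $2; gsub(/[ \t]/, "", disk); next }
        /^ *Path [0-9]+:/ { path = $1; gsub(/^ +/, "", path); next }
        /Path Selections/ { next }
        # Plain "label: value" lines only; timestamps contain extra colons.
        NF == 2 {
            name = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/[ \t]/, "", value)
            if (value ~ /^[0-9]+$/ && value + 0 > 0)
                printf "%s %s: %s = %s\n", disk, path, name, value
        }
    '
}

# On an AIX host with lsmpio available:
#   lsmpio -l hdisk1 -S | nonzero_path_errors
```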

Another feature mentioned in the IBM article was the new -U flag for the chdev command. The article states, “For the newest technology levels of AIX (at the time of publishing this article), some disk attributes on some devices support the -U flag on the chdev command. This flag instructs chdev to attempt a dynamic update of the attribute value. With this flag, the attribute value can be changed without closing the disk and the change takes effect immediately.”

And from the AIX chdev man page: “-U: Changes the characteristics of the device while allowing the device to remain in the Available state. This flag cannot be used with the -P or -T flag. Not all devices and attributes support the -U flag.”

To support this new capability, the output from the lsattr command has also been updated. Attributes that can be changed dynamically (with the -U option) now show a plus sign (+) after the value in the user_settable column of the lsattr output. I verified this on my lab system (running AIX 6.1 TL9), which was connected to an XIV storage system. Sure enough, several user-changeable attributes now displayed True+ rather than True.

# lsdev -Cc disk | grep hdisk15
hdisk15          Available   MPIO 2810 XIV Disk

# lsattr -El hdisk15
attribute       value                       description                      user_settable

PCM             PCM/friend/fcpother         Path Control Module              False
PR_key_value    none                        Persistant Reserve Key Value     True+
algorithm       round_robin                 Algorithm                        True+
clr_q           no                          Device CLEARS its Queue on error True
dist_err_pcnt   0                           Distributed Error Percentage     True
dist_tw_width   50                          Distributed Error Sample Time    True
hcheck_cmd      inquiry                     Health Check Command             True+
hcheck_interval 60                          Health Check Interval            True+
hcheck_mode     nonactive                   Health Check Mode                True+
location                                    Location Label                   True+
lun_id          0x9000000000000             Logical Unit Number ID           False
lun_reset_spt   yes                         LUN Reset Supported              True
max_coalesce    0x40000                     Maximum Coalesce Size            True
max_retry_delay 60                          Maximum Quiesce Time             True
max_transfer    0x80000                     Maximum TRANSFER Size            True
node_name       0x5001738000510000          FC Node Name                     False
pvid          00f62768504e28790000000000000000  Physical volume identifier       False
q_err           yes                         Use QERR bit                     True
q_type          simple                      Queuing TYPE                     True
queue_depth     40                          Queue DEPTH                      True
reassign_to     120                         REASSIGN time out value          True
reserve_policy  no_reserve                  Reserve Policy                   True+
rw_timeout      30                          READ/WRITE time out value        True
scsi_id         0x10200                     SCSI ID                          False
start_timeout   60                          START unit time out value        True
timeout_policy  retry_path                  Timeout Policy                   True+
unique_id       2611200173800005102BE072810XIV03IBMfcp Unique device identifier  False
ww_name         0x500173800051019           FC World Wide Name               False

# lsattr -El hdisk15 | grep True+
PR_key_value    none                        Persistant Reserve Key Value     True+
algorithm       round_robin                 Algorithm                        True+
hcheck_cmd      inquiry                     Health Check Command             True+
hcheck_interval 60                          Health Check Interval            True+
hcheck_mode     nonactive                   Health Check Mode                True+
location                                    Location Label                   True+
reserve_policy  no_reserve                  Reserve Policy                   True+
timeout_policy  retry_path                  Timeout Policy                   True+
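
The same check can be wrapped in a small function that prints only the attribute names. This is a sketch under the assumption that the user_settable value is always the last field on each lsattr line; dynamic_attrs is a hypothetical name.

```shell
# List the attributes that `lsattr -El` marks as dynamically changeable.
# Reads lsattr output on stdin; the last field is the user_settable value.
dynamic_attrs() {
    awk '$NF == "True+" { print $1 }'
}

# On an AIX host:
#   lsattr -El hdisk15 | dynamic_attrs
```

Matching on the last field rather than a fixed column keeps the filter working for rows such as location, where the value column is empty.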

If you are using non-IBM storage, you may find that these attributes cannot be changed dynamically (and will not display True+). Devices running the AIX-supplied ODM should have several attributes that are changeable. Note that at the time of writing, the ODM entries for VSCSI disk devices had not been updated to support this new feature.

I attempted to change one of the attributes using the -U flag. I changed the timeout_policy attribute from retry_path to fail_path. You’ll observe below that when I didn’t specify the -U option, my change was rejected because the device was busy.

# lsattr -El hdisk15 -a timeout_policy
timeout_policy retry_path Timeout Policy True+

# lsattr -Rl hdisk15 -a timeout_policy
retry_path
fail_path
disable_path

# chdev -l hdisk15 -a timeout_policy=fail_path
Method error (/usr/lib/methods/chgdisk):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

# chdev -l hdisk15 -a timeout_policy=fail_path -U
hdisk15 changed

# lsattr -El hdisk15 -a timeout_policy
timeout_policy fail_path Timeout Policy True+
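
To avoid the busy-device error in the first place, the user_settable value can be checked before choosing the chdev form. The following is a dry-run sketch only: chdev_plan is a hypothetical helper that prints the command it would suggest and never runs chdev itself.

```shell
# Print the chdev command suited to an attribute, based on `lsattr -El`
# output read from stdin. Nothing is changed on the system.
# Usage: lsattr -El hdisk15 | chdev_plan hdisk15 timeout_policy fail_path
chdev_plan() {
    disk=$1 attr=$2 value=$3
    # Look up the user_settable field (last column) for the attribute.
    settable=$(awk -v a="$attr" '$1 == a { print $NF }')
    case $settable in
        True+) echo "chdev -l $disk -a $attr=$value -U" ;;   # dynamic update
        True)  echo "chdev -l $disk -a $attr=$value" ;;      # disk must not be busy
        *)     echo "$attr is not user-settable on $disk" >&2
               return 1 ;;
    esac
}
```

Review the printed command before running it, and confirm the new value appears in the lsattr -Rl output for that attribute.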

Note: It took almost 2 minutes to update the device attribute. I also found that some attributes on one of my SAS disks appeared to support the concurrent update option.

# lsdev -Cc disk | grep -w hdisk0
hdisk0           Available   SAS Disk Drive

# lsattr -El hdisk0
attribute       value                             description            user_settable

PCM             PCM/friend/scsiscsd                Path Control Module           False
algorithm       fail_over                          Algorithm                     True+
dist_err_pcnt   0                                  Distributed Error Percentage  True
dist_tw_width   50                                 Distributed Error Sample Time True
hcheck_interval 0                                  Health Check Interval         True+
hcheck_mode     nonactive                          Health Check Mode             True+
max_coalesce    0x10000                            Maximum Coalesce Size         True
max_transfer    0x100000                           Maximum TRANSFER Size         True
pvid            00f627686c0f58f40000000000000000   Physical volume identifier    False
queue_depth     16                                 Queue DEPTH                   True
reserve_policy  no_reserve                         Reserve Policy                True
size_in_mb      146800                             Size in Megabytes             False
unique_id       2A1135000C5003BE0FDB70BST9146852SS03IBMsas Unique device identifier      False
ww_id           5000c5003be0fdb7                   World Wide Identifier         False

It’s good to see that there is continued work underway to enhance the AIX operating system. Both of these new features go a long way to making the life of an AIX administrator a lot easier and ultimately making AIX a highly available OS that can undergo dynamic system changes, avoiding the need for scheduled outages.

        