AIX Tips, Tricks and Useful Tools

Periodically I like to remind everyone about the tools they have at their disposal and to provide a few tips and tricks to help with maintaining systems. In this article we will look at some of the tools you can use in your day-to-day work.

YUM and DNF

If you’re dealing with AIX Toolbox packages like gzip, sudo or samba, a lot has changed over the past few years. Previously we used RPM to install these, then we moved on to YUM. YUM is far superior to plain RPM because it works out all the prerequisites and corequisites for you. More recently IBM has started to replace YUM with DNF, the Python 3 replacement for YUM. The install script for DNF is found on the Toolbox home page, and DNF can be installed instead of YUM or as a replacement for an existing YUM install. On opensource.com there is a quick guide to DNF for YUM users. I have been systematically converting all my systems from YUM to DNF and it works really well.

Cloning rootvg

I’ve talked about this many times, but I want to stress that it’s much faster to back out of a maintenance run by booting from a clone than it is to restore a mksysb. You should still always take a mksysb (to your NIM server or tape or whatever you use) when doing maintenance, but taking a clone gives you a very fast recovery as well. I always request two LUNs for rootvg when I build a system, even when it is on the SAN where it does not need to be mirrored. I usually ask for two 100GB or 150GB LUNs.

When I want to perform maintenance, I take a copy of the running rootvg as follows:

(let’s assume rootvg is on hdisk0 and the spare disk is hdisk1)

alt_disk_copy -B -V -d hdisk1

The -V means verbose output and the -B means do not change the bootlist.

I would still check the bootlist using:

bootlist -m normal -o

The bootlist command above only shows the first five entries—if you’re mirrored or have lots of paths then you will not see them all.

If you want more detailed information on the bootlist you can add -v to use verbose mode:

bootlist -m normal -o -v

Finally, if you’re not sure which disk you actually booted from you can use the following command:

bootinfo -b

It will return the hdisk that was used at the last boot.
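Because -B leaves the bootlist untouched, a quick scripted check can confirm that the first boot device is still the disk you expect before you reboot. The following is a minimal sketch, not a definitive implementation: the helper names first_boot_device and boots_from are hypothetical, and it assumes each line of the bootlist -m normal -o output starts with the hdisk name (for example "hdisk0 blv=hd5 pathid=0").

```shell
# Print the first device named in `bootlist -m normal -o` output read on stdin.
# Assumption: each output line begins with the hdisk name.
first_boot_device() {
    awk 'NF { print $1; exit }'
}

# Succeed only if the first boot device on stdin is the disk passed as $1.
boots_from() (
    [ "$(first_boot_device)" = "$1" ]
)

# On AIX, after alt_disk_copy -B -V -d hdisk1, this could be used as:
#   bootlist -m normal -o | boots_from hdisk0 || echo "bootlist changed, review before rebooting"
```

The check only looks at the first entry, which is the one the LPAR tries first at boot.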

HMC Doesn’t Show Updated Version After an Upgrade

I’ve had times where my HMC was still showing the wrong release after I upgraded my VIO servers and AIX servers. It usually corrects itself after a while, but not always. Gareth Coates has a support page that explains how to resolve this issue.

On the HMC you use the --osrefresh flag as follows:

lssyscfg -m servername -r lpar -F os_version --osrefresh

Replace servername above with the actual server name.

Gareth also provides a script that refreshes the oslevel for all LPARs on a server and that can be executed by using ssh to the HMC.
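The idea of refreshing every managed system can be sketched as below. This is not Gareth's script, only a hedged illustration: the helper name osrefresh_cmds is hypothetical, it assumes managed system names contain no spaces, and it only prints the commands rather than running them. The name field is added to -F so that each OS level is shown next to its LPAR.

```shell
# Read managed system names on stdin (one per line) and print the
# lssyscfg refresh command for each one. Nothing is executed here.
osrefresh_cmds() (
    while IFS= read -r ms; do
        if [ -n "$ms" ]; then
            echo "lssyscfg -m $ms -r lpar -F name,os_version --osrefresh"
        fi
    done
)

# Possible use from an AIX LPAR with ssh access to the HMC
# (hscroot and hmcname are placeholders):
#   ssh hscroot@hmcname 'lssyscfg -r sys -F name' | osrefresh_cmds
```

The printed commands can then be reviewed and run on the HMC one at a time.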

lsmpio Command 

The lsmpio command allows you to get details on the disk LUNs that can help you identify the disks correctly.

lsmpio -ql hdisk1
Device:  hdisk1
          Vendor Id:  PURE
         Product Id:  FlashArray
           Revision:  8888
           Capacity:  100.00GiB
      Volume Serial:  624A9370721735F697DB4ADA0019F933  (Page 83 NAA)

In the above I can see the volume serial which I can compare to the volume serial provided to me by the storage group—this way I can be certain I have the correct disk.
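The comparison can also be scripted so that a typo in a long serial does not slip through. The sketch below is illustrative only: the helper names volume_serial and serial_matches are hypothetical, and it assumes the "Volume Serial:" line has the layout shown in the output above.

```shell
# Print the volume serial from `lsmpio -ql hdiskN` output read on stdin.
# Assumption: the line looks like "Volume Serial:  <serial>  (Page 83 NAA)".
volume_serial() {
    awk '/Volume Serial:/ { print $3; exit }'
}

# Succeed only if the serial on stdin matches $1, ignoring hex digit case.
serial_matches() (
    expected=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    actual=$(volume_serial | tr 'a-f' 'A-F')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
)

# On AIX this could be used as:
#   lsmpio -ql hdisk1 | serial_matches 624A9370721735F697DB4ADA0019F933 && echo "hdisk1 is the expected LUN"
```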

POWER9 LPAR Hangs on CA000040 Plus Secure Boot Issue

I’ve seen this on both AIX and VIO LPARs. If they haven’t been rebooted for 814 or more days and your firmware is back-level then the LPAR may hang with a CA000040 at reboot. This is described on the IBM website.

The fix is to change the LPAR or VIO server to POWER8 mode and reactivate it. After the firmware is updated you can then change the LPAR back to default mode and reactivate it so it comes back up in POWER9 mode. The good news is that VIO servers and AIX LPARs can run for more than 814 days (most recently the one I rebooted had been up for over 870 days). The bad news is you can get caught out with things like this. This is one of the many reasons that I recommend that LPARs and VIO servers get rebooted at least once every six months and that you schedule patching every six months (or sooner for critical patches).

Also note that for POWER9 FW940 and higher there’s a new firmware secure boot feature. If your boot adapters aren’t up to date on firmware the LPAR should still boot, but you will get error messages with SRCs BA5400A5 or BA5400A6. This is documented in the server firmware FW940 readme; however, it’s not mentioned in the FW941 readme, which is how I got caught out.

The error shows as fixed memory address alignment errors ending with the BA218003 error code although it may also show as BA2100001, BA2180001 or BA210003. This applies to all 16Gb and higher HBAs. So, please don’t forget to update your HBA firmware prior to upgrading your server firmware unless the readmes for server and adapter firmware say otherwise.

HMCScanner

The tool I use the most (apart from nmon) has to be the HMCScanner. As soon as I get access to an environment, I run HMCScanner against every HMC so I have a good picture of the environment. I do this before and after every change I make to a server. A significant amount of information can be obtained using the HMCScanner if you’re using HMCs.

HMCScanner is a Java-based tool that connects to your HMC and documents everything on it that the HMC can see. It’s easy to install and can be run from Windows or AIX. On Windows I change into the directory and type in:

hmcScanner.bat  hmcname  hscroot -p password 

Substitute your HMC name for hmcname and use the correct username and password.

On AIX, there’s a ksh script you run with the same syntax.

The subsequent spreadsheet fully documents whatever the HMC can see, including virtual ethernets and the shared ethernet adapter (SEA). From the output you’ll get a list of all the servers that the HMC can see, including model and serial number, the server firmware levels, the HMC software level and the OS level for the LPARs and VIO servers. If LPARs are firewalled such that the HMC cannot communicate with them using RMC, then you will not see the OS level and you will also not be able to use DLPAR with those LPARs.

Once the HMCScanner report is run, you can then review firmware and maintenance levels and check for withdrawal dates, which will then allow you to prepare a plan for updates and a budget plan for replacements. The next step is to determine what’s currently out of service or about to be, which brings us to FLRT.

FLRT, FLRTLITE and FLRTVC

IBM provides a very useful set of tools on their website. FLRT is the Fix Level Recommendation Tool, an interactive tool you can use to check your current levels against the recommended ones. FLRTLITE provides lists of levels that you can manually compare to what you are running. The FLRT home page provides multiple tools under icons such as report tools, data tables, scripting tools and APAR tools.

Finally, FLRT Vulnerability Checker (FLRTVC) is a script that you install on the VIO server or AIX LPAR to be checked. The script uses wget or curl to try to download a file called apar.csv from IBM and then checks known issues against your software levels. The most common things it finds are back levels of SSH, SSL and Java, but it will also notify you of critical ifixes and efixes that you should review. If your server is unable to download the apar.csv file, you can download it yourself from the site. The downloaded file is called hiper.csv and you rename it to apar.csv. Then just edit the script and change SKIPDOWNLOAD=0 to SKIPDOWNLOAD=1. It will then read the local file.
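For servers without internet access, the rename and the script edit can be done in one step. This is a minimal sketch under stated assumptions: the helper name flrtvc_offline is hypothetical, the directory passed in is assumed to hold both flrtvc.ksh and the manually downloaded hiper.csv, and the setting is assumed to appear as SKIPDOWNLOAD=0 at the start of a line in the script.

```shell
# Prepare FLRTVC for offline use in the directory given as $1.
# Assumptions: $1 holds flrtvc.ksh and hiper.csv, and the script
# contains a line that starts with SKIPDOWNLOAD=0.
flrtvc_offline() (
    cd "$1" || exit 1
    if [ -f hiper.csv ]; then
        mv hiper.csv apar.csv || exit 1
    fi
    if [ ! -f apar.csv ]; then
        echo "apar.csv not found in $1" >&2
        exit 1
    fi
    # Write to a temporary file and replace the original, which avoids
    # relying on an in-place option in sed.
    sed 's/^SKIPDOWNLOAD=0/SKIPDOWNLOAD=1/' flrtvc.ksh > flrtvc.ksh.new &&
        mv flrtvc.ksh.new flrtvc.ksh &&
        chmod +x flrtvc.ksh
)

# Example (the path is a placeholder):
#   flrtvc_offline /tmp/flrtvc
```

It is worth keeping a copy of the original script so the change is easy to back out.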

I then run FLRTVC as follows:

./flrtvc.ksh >systemname-flrtvc.txt

The .txt file can be downloaded and opened in Excel. That file identifies the efixes and ifixes that need to go on, provides links to the readmes, and provides links to the actual download where possible.

Where to Get IBM Tools

Fix Central is used to get updates for server firmware, I/O firmware, AIX, VIO servers, HMCs, Spectrum Scale, Java and many other applications.

SSH and SSL patches are found at the IBM Web Download Site (which requires an IBM login). You select either OpenSSH or OpenSSL and then continue and it takes you to the latest downloads. This is also where you can download the latest versions of xgzip and other software.

The AIX Linux Toolbox is used to download open-source software that’s typically installed with RPM, YUM or DNF.

Backing up Your HMC

Prior to performing maintenance on my HMC or applying firmware to servers, I always back up my server profiles and my HMC. I don’t just back up to USB; I also back up to my NIM server (or another LPAR) so I have a remote copy.

Below are some examples, assuming the server names on the HMC are 9009-22a-serial1 and 9009-22a-serial2 and the NIM server is 192.168.2.70 with a backup filesystem /backups/hmcsave and /backups/hmcbkup. The directories are owned by my jlynch userid.

First I back up the profile data:

bkprofdata -m 9009-22a-serial1 -f backup-server1-dec062021
bkprofdata -m 9009-22a-serial2 -f backup-server2-dec062021

Then I do a saveupgdata to local disk and remote NIM server:

saveupgdata -r disk
saveupgdata -r diskftp -h 192.168.2.70 -u jlynch -d /backups/hmcsave

Finally I take a remote backup to the NIM server:

bkconsdata -r ftp -h 192.168.2.70 -u jlynch -d /backups/hmcbkup

In the above you can use disksftp and sftp if you prefer them to diskftp and ftp.
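With more than a couple of managed systems it is easy to mistype a name or a date, so it can help to generate the command list. The sketch below is an illustration only: the helper name hmc_backup_cmds is hypothetical, it reuses the backup directories and file naming from the examples above, and it prints the commands instead of running them.

```shell
# Print the HMC backup commands for a set of managed systems.
# Usage: hmc_backup_cmds <backup-host> <user> <date-tag> <managed-system>...
hmc_backup_cmds() (
    host=$1
    user=$2
    tag=$3
    shift 3
    n=1
    for ms in "$@"; do
        # One profile backup per managed system, numbered in argument order.
        echo "bkprofdata -m $ms -f backup-server$n-$tag"
        n=$((n + 1))
    done
    echo "saveupgdata -r disk"
    echo "saveupgdata -r diskftp -h $host -u $user -d /backups/hmcsave"
    echo "bkconsdata -r ftp -h $host -u $user -d /backups/hmcbkup"
)

hmc_backup_cmds 192.168.2.70 jlynch dec062021 9009-22a-serial1 9009-22a-serial2
```

The output is the same five commands shown above, ready to be reviewed and pasted into an HMC session.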

It’s good practice to regularly back up your profiles and your HMC to a remote system. I also keep a copy of the recovery iso on my NIM server as I can load that to the new POWER HMCs as virtual media if I have problems.

nmon

No article on tools would be complete without mention of nmon and the nmon analyser. nmon has been part of the OS since AIX 6.1 and is used by admins all over the world to gather performance data. I like to run it using the following flags so that I don’t miss anything:

nmon -ft -AOPV^dML -s 15 -c 120

The above takes a 30-minute snapshot (120 x 15-second snaps) and includes asynchronous I/O (A), the SEA (O), paging (P), volume groups (V), fibre adapter statistics (^), disk service times (d), memory pages (M), and large pages (L). If you’re running Workload Manager (WLM) then you can add the W flag as well. I run nmon all the time, so my normal cron job runs for 24 hours using “-s 150 -c 576”. There is also a version of nmon for Linux; the flags are slightly different, but it still provides very valuable information.
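The relationship between -s (seconds between snapshots) and -c (number of snapshots) is simple arithmetic, and it is handy to have when choosing a different interval. The helper names nmon_minutes and nmon_count below are hypothetical and exist only to show the calculation.

```shell
# Total capture time in minutes for an interval in seconds ($1)
# and a snapshot count ($2).
nmon_minutes() {
    echo $(( $1 * $2 / 60 ))
}

# Snapshot count needed to cover a number of hours ($1)
# at an interval in seconds ($2).
nmon_count() {
    echo $(( $1 * 3600 / $2 ))
}

nmon_minutes 15 120   # prints 30, the 30-minute snapshot
nmon_count 24 150     # prints 576, the 24-hour cron job
```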

nmon Analyser and nmon Visualizer

nmon analyser goes hand in hand with nmon. It’s an Excel spreadsheet that processes nmon files and produces graphs of what’s happening. It’s not a total performance monitoring solution, but it provides some valuable information for day to day performance work. I supplement this with my own data gathering scripts that gather more in-depth data.

nmon visualizer is a Java GUI that can be used to analyse one or more .nmon files. It also parses IOStat files, IBM verbose GC logs, Windows Perfmon and ESXTop CSV data and JSON data. It allows you to drill down in a more visual manner. Another option for processing nmon files is nmonchart.

njmon

njmon is similar to nmon, but it saves to a JSON format or can be saved directly into InfluxDB. It can be used on AIX, VIO servers and Linux. It collects more performance and configuration data than nmon and is particularly useful if you want to do near real-time graphing.

loopmount Command

The loopmount command is an incredibly useful command that allows you to mount iso images in an LPAR. I have a habit of losing install DVDs, so I use software to rip them into .iso files and then I upload those .iso files to my NIM server, where I store them in an NFS-exported directory. That way, from any LPAR I can grab the iso and mount it as if it were in a DVD drive. If you’re uploading multiple images, give them meaningful names so that you can figure out what they actually are later. I now download all my images from Fix Central as .iso files wherever I can and use smitty bffcreate to then create the install directory for NIM or just generically.

Here’s an example of how to mount an iso image, assuming it’s called aix71-tl04sp1-cd1.iso, that it’s in /isoimages and that I have a mount point called /isomnt:

loopmount -i /isoimages/aix71-tl04sp1-cd1.iso -o "-V cdrfs -o ro" -m /isomnt

I can now use ls on /isomnt and it is as if the CD itself was mounted.

No Shortage of AIX Tools 

As you can see there are many tools out there that can help streamline day to day activities. This article only covers a small number of those tools. The virtual user groups and Nigel’s AIXPert blog are great places to go to find more information on available tools. Additional information and links can be found at the IBM Support Portal.
