Automating AIX Live Update with Ansible

IBM’s Chris Gibson explains how to perform AIX Live Update using Ansible with command examples

TechChannel Systems Management

In the past, I’ve written about my experiences with AIX Live Update and the ability to update and patch AIX servers without a reboot. More recently, I tested AIX Live Update (also known as Live Kernel Update, or LKU) with Ansible. Automating this powerful AIX function allows administrators to streamline the live update process, reduce manual errors, and perform this type of operation at scale, unattended.

The IBM Power development team provided support for LKU in v1.9.0 of the IBM AIX Ansible collection. See links below. Here, I’ll outline the process.

Explore What's New: AIX Ansible collection version 1.9.0 available now!

lku – Performs live kernel update

module > lku

How to Perform AIX Live Update With Ansible

At the time of writing this article, the latest version of the AIX Ansible collection was 1.9.2. This is the version I installed on my Ansible controller.

# ansible-galaxy collection list ibm.power_aix

# /.ansible/collections/ansible_collections

Collection    Version

------------- -------

ibm.power_aix 1.9.2
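
If you want to pin the same collection version on another controller, one option is a requirements file. The following is a minimal sketch; the file name requirements.yml is my assumption, not something from the original setup.

```yaml
# requirements.yml (hypothetical file name)
# Pins the collection to the version listed above.
collections:
  - name: ibm.power_aix
    version: "1.9.2"
```

The file would then be installed with ansible-galaxy collection install -r requirements.yml.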

I reviewed the provided lku demo playbook that came with the collection:

/.ansible/collections/ansible_collections/./ibm/power_aix/playbooks/demo_lku.yml

Note that currently this module only supports live update operations on AIX virtual machines (VMs) managed by PowerVC. The module uses the geninstall -k command for live kernel update.

I created my own Ansible playbook to test this new module. The aim of the playbook was to install an interim fix (ifix) on my AIX server using the live update function. The playbook first copies the required ifix package to the AIX host(s) that need to be patched. The ifix is copied from the /home/ifix directory on the Ansible controller to the /tmp/ifixes directory on the AIX hosts. Next, the lku Ansible module is called to authenticate the AIX host(s) with the PowerVC server and then install the ifix with live update. The final step in the playbook removes the ifix from the /tmp/ifixes directory on the AIX host(s). The contents of my playbook are shown below.

lku_quark.yml:

---
- name: "Install ifix with Live Update on AIX"
  hosts: aixhosts
  gather_facts: false
  collections:
    - ibm.power_aix
  vars:
    local_fixes_dir: /home/ifix/
    remote_fixes_dir: /tmp/ifixes
    PVC_name: pvc1
    PVC_password: abc123
    PVC_user: root
    directory: /tmp/ifixes
    filesets_fixes: IJ45274s1a.230207.epkg.Z
  tasks:
    # Build a list of ifix file names found on the Ansible controller
    - name: find all fixes
      ansible.builtin.set_fact:
        fixes_list: "{{ fixes_list | default([]) + [ (item | basename) ] }}"
      with_fileglob:
        - "{{ local_fixes_dir }}/*.epkg.Z"
    - name: create temporary directory for fixes on the target
      ansible.builtin.file:
        path: "{{ remote_fixes_dir }}"
        owner: root
        group: system
        mode: "0700"
        state: directory
    - name: copy fixes to the target
      ansible.builtin.copy:
        src: "{{ local_fixes_dir }}/{{ item }}"
        dest: "{{ remote_fixes_dir }}/{{ item }}"
        owner: root
        group: system
        mode: "0600"
      loop: "{{ fixes_list }}"
    # Authenticate with PowerVC and install the ifix with live update
    - name: To install all the updates and interim fixes that are available in the  "{{ directory }}" directory
      lku:
        PVC_name: "{{ PVC_name }}"
        PVC_password: "{{ PVC_password }}"
        PVC_user: "{{ PVC_user }}"
        directory: "{{ directory }}"
        filesets_fixes: "{{ filesets_fixes }}"
      register: result
    - name: Display result
      ansible.builtin.debug:
        var: result
    # Clean up the copied ifix packages
    - name: remove fixes from the target
      ansible.builtin.file:
        path: "{{ remote_fixes_dir }}/{{ item }}"
        state: absent
      loop: "{{ fixes_list }}"
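
The playbook targets a host group named aixhosts, so the inventory needs to define that group. A minimal YAML inventory might look like the sketch below. The file name inventory.yml and the ansible_user setting are assumptions for illustration; only the group name (aixhosts) and host name (quark) come from my environment.

```yaml
# inventory.yml (hypothetical file name)
all:
  children:
    aixhosts:
      hosts:
        quark:
          # Assumption: connect as root; adjust for your environment
          ansible_user: root
```

With an explicit inventory file like this, the playbook would be run as ansible-playbook -i inventory.yml lku_quark.yml.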

I ran my live update (lku) playbook from my Ansible controller, as shown below.

# ansible-playbook lku_quark.yml

PLAY [Install ifix with Live Update on AIX] *******************************************************************************************

TASK [find all fixes] *****************************************************************************************************************
ok: [quark] => (item=/home/ifix/IJ45274s1a.230207.epkg.Z)

TASK [create temporary directory for fixes on the target] *****************************************************************************
ok: [quark]

TASK [copy fixes to the target] *******************************************************************************************************
ok: [quark] => (item=IJ45274s1a.230207.epkg.Z)

TASK [To install all the updates and interim fixes that are available in the  "/tmp/ifixes" directory] ********************************
[WARNING]: Module did not set no_log for PVC_password
ok: [quark]

TASK [Display result] *****************************************************************************************************************
ok: [quark] => {
    "result": {
        "changed": false,
        "cmd": "geninstall -k -d /tmp/ifixes IJ45274s1a.230207.epkg.Z",
        "failed": false,
        "msg": "Live Kernel Update operation has been performed successfully.",
        "rc": 0,
        "stderr": "",
        "stderr_lines": [],
        "stdout": "\n\n+-----------------------------------------------------------------------------+\n                    Pre-Live Update Verification...\n+-----------------------------------------------------------------------------+\nVerifying environment...done\nVerifying /var/adm/ras/liveupdate/lvupdate.data file...done\nComputing the estimated time for the live update operation...done\nResults...\n\nEXECUTION INFORMATION\n---------------------\n  LPAR: quark\n  PowerVC: 10.8.12.120\n  user: root\n\n  Blackout time(in seconds): 20\n  Total operation time(in seconds): 1155\n\n  << End of Information Section >>\n\n+-----------------------------------------------------------------------------+\n                    Live Update Requirement Verification...\n+-----------------------------------------------------------------------------+\n\nINFORMATION\n-----------\nINFO: Any system dumps present in the current dump logical volumes will not be available after live update is complete.\n\n  << End of Information Section >>\n\n+-----------------------------------------------------------------------------+\n                    Live Update Preview Summary...\n+-----------------------------------------------------------------------------+\nThe live update preview succeeded.\n\nNon-interruptable live update operation begins in 10 seconds.\n\nInitializing live update on original LPAR.\n\nValidating original LPAR environment.\n\nBeginning live update operation on original LPAR.\n\nRequesting resources required for live update.\n............\nNotifying applications of impending live update.\n....\nCreating rootvg for boot of surrogate.\n....................................................................\nStarting the surrogate LPAR.\n....................................................................................................................................................\nCreating mirror of original LPAR's rootvg.\n....................................\nMoving workload to surrogate LPAR.\n........\n\tBlackout Time started.\n\n\tBlackout Time end.\n\nWorkload is running on surrogate LPAR.\n....................................................................................\nShutting down the Original LPAR.\n............................\nThe live update operation succeeded.\n",
        "stdout_lines": [
            "",
            "",
            "+-----------------------------------------------------------------------------+",
            "                    Pre-Live Update Verification...",
            "+-----------------------------------------------------------------------------+",
            "Verifying environment...done",
            "Verifying /var/adm/ras/liveupdate/lvupdate.data file...done",
            "Computing the estimated time for the live update operation...done",
            "Results...",
            "",
            "EXECUTION INFORMATION",
            "---------------------",
            "  LPAR: quark",
            "  PowerVC: 10.8.12.120",
            "  user: root",
            "",
            "  Blackout time(in seconds): 20",
            "  Total operation time(in seconds): 1155",
            "",
            "  << End of Information Section >>",
            "",
            "+-----------------------------------------------------------------------------+",
            "                    Live Update Requirement Verification...",
            "+-----------------------------------------------------------------------------+",
            "",
            "INFORMATION",
            "-----------",
            "INFO: Any system dumps present in the current dump logical volumes will not be available after live update is complete.",
            "",
            "  << End of Information Section >>",
            "",
            "+-----------------------------------------------------------------------------+",
            "                    Live Update Preview Summary...",
            "+-----------------------------------------------------------------------------+",
            "The live update preview succeeded.",
            "",
            "Non-interruptable live update operation begins in 10 seconds.",
            "",
            "Initializing live update on original LPAR.",
            "",
            "Validating original LPAR environment.",
            "",
            "Beginning live update operation on original LPAR.",
            "",
            "Requesting resources required for live update.",
            "............",
            "Notifying applications of impending live update.",
            "....",
            "Creating rootvg for boot of surrogate.",
            "....................................................................",
            "Starting the surrogate LPAR.",
            "....................................................................................................................................................",
            "Creating mirror of original LPAR's rootvg.",
            "....................................",
            "Moving workload to surrogate LPAR.",
            "........",
            "\tBlackout Time started.",
            "",
            "\tBlackout Time end.",
            "",
            "Workload is running on surrogate LPAR.",
            "....................................................................................",
            "Shutting down the Original LPAR.",
            "............................",
            "The live update operation succeeded."
        ],
        "warnings": [
            "Module did not set no_log for PVC_password"
        ]
    }
}

TASK [remove fixes from the target] ***************************************************************************************************
changed: [quark] => (item=IJ45274s1a.230207.epkg.Z)

PLAY RECAP ****************************************************************************************************************************
quark                      : ok=6    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
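
The warning in the output above notes that the module does not set no_log for PVC_password, and my test playbook keeps the password in plain text. For anything beyond a lab test, one way to reduce that exposure is to move the password into a file encrypted with Ansible Vault. The sketch below shows two separate files in one listing; the file name vault_pvc.yml and the variable name vault_pvc_password are hypothetical.

```yaml
# File 1: vault_pvc.yml (hypothetical vars file), encrypted with:
#   ansible-vault encrypt vault_pvc.yml
vault_pvc_password: "replace-with-real-password"
---
# File 2: play header from lku_quark.yml, changed to read the vaulted value
- name: "Install ifix with Live Update on AIX"
  hosts: aixhosts
  gather_facts: false
  collections:
    - ibm.power_aix
  vars_files:
    - vault_pvc.yml
  vars:
    PVC_password: "{{ vault_pvc_password }}"
```

The playbook would then be run with ansible-playbook --ask-vault-pass lku_quark.yml. Adding no_log: true to the lku task would also keep the password out of the logs, but it hides the registered result as well, so it is a trade-off.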

The lku module handled all the steps required to perform a (PowerVC-based) live update operation. This included authenticating with the PowerVC server. I noted that before I ran the playbook the AIX host to be updated had not yet authenticated with the PowerVC server (which is something you would typically do with the pvcauth command, before starting live update). A few moments after starting the playbook, the host had successfully authenticated with PowerVC, and then the live update process began on the host; the geninstall -k -d /tmp/ifixes IJ45274s1a.230207.epkg.Z command was called successfully.

root@quark / # pvcauth -l

root@quark / # pvcauth -l

Address  : 10.8.12.120

User name: root

Project  : ibm-default

Port     : 5000

TTL      : 5:59:59

root@quark / #

Broadcast message from root@quark (pts/1) at 23:07:22 ...

Live AIX update in progress.

root@quark / # ps -ef | grep geninstall

    root  8716586  8978832   0 23:06:39  pts/1  0:00 /bin/ksh /usr/sbin/geninstall -k -d /tmp IJ45274s1a.230207.epkg.Z

PowerVC deployed the surrogate VM (with the ifix installed and activated) and the workload was live migrated from the original VM to the surrogate. When the live update operation was complete, the original VM was removed. All of this is expected behavior with live update.
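
When running unattended, it can help to have the playbook itself confirm that the operation succeeded rather than relying on someone reading the output. The task below is a sketch of such a check, not part of my original playbook. It assumes it is placed directly after the lku task and uses only fields that appear in the registered result shown earlier (rc and stdout).

```yaml
# Hypothetical check; "result" is the variable registered by the lku task.
- name: Fail the play if the live update did not report success
  ansible.builtin.assert:
    that:
      - result.rc == 0
      - "'The live update operation succeeded.' in result.stdout"
    fail_msg: "Live update did not complete cleanly; review result.stdout"
    success_msg: "Live update reported success"
```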

The live update operation took about 20 minutes to complete (which aligns with the estimate provided by the geninstall command, “Total operation time(in seconds): 1155”, shown in the ansible-playbook command output above). There was only a brief pause during the blackout window when the workload migrated to the surrogate VM (that is, “Blackout time(in seconds): 20”). There was no outage.
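
The estimates quoted above can also be pulled out of the registered result so they can be logged or compared across hosts. The tasks below are a rough sketch under a few assumptions: they run after the lku task, the preview text keeps the same wording as in my output, and the fact names lku_total_seconds and lku_blackout_seconds are made up for illustration. The pattern uses "." in place of the literal parentheses to avoid escaping.

```yaml
# Hypothetical follow-up tasks; "result" is the variable registered by the lku task.
- name: Extract the estimated timings from the live update preview
  ansible.builtin.set_fact:
    lku_total_seconds: >-
      {{ result.stdout
         | regex_search('Total operation time.in seconds.: ([0-9]+)', '\\1')
         | first | int }}
    lku_blackout_seconds: >-
      {{ result.stdout
         | regex_search('Blackout time.in seconds.: ([0-9]+)', '\\1')
         | first | int }}

- name: Report the estimate in minutes and seconds
  ansible.builtin.debug:
    msg: >-
      Estimated total: {{ (lku_total_seconds | int) // 60 }} min
      {{ (lku_total_seconds | int) % 60 }} s,
      blackout: {{ lku_blackout_seconds }} s
```

For the run above, 1155 seconds works out to 19 minutes 15 seconds, which matches the roughly 20 minutes I observed. If the preview text is missing or worded differently, the extraction task would fail, so treat this as a starting point only.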

The PowerVC UI screenshots below show the deployment of the new surrogate VM, and the eventual removal of the original VM.

Figure 1. Deployment of the surrogate VM (in Building state).

Figure 2. Both the original and surrogate VMs during the live update operation, with workload migrating to surrogate.

Figure 3. Live update operation is complete. The original VM has been removed.

The playbook performed exactly as I hoped it would. After the live update operation finished, the ifix was installed and active on my AIX VM (quark), without a reboot!

# uptime ; emgr -l

  12:36AM   up 108 days,  23:05,  3 users,  load average: 1.98, 1.92, 1.42

ID  STATE LABEL      INSTALL TIME      UPDATED BY ABSTRACT

=== ===== ========== ================= ========== ======================================

1    S    IJ45274s1a 08/08/24 23:19:41            Ifix for APAR IJ45274

STATE codes:

 S = STABLE
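
The same check could be added to the end of the playbook so that each host confirms the ifix is installed. This is a sketch rather than something I ran: it calls emgr -l through the generic command module and looks for the ifix label in the output. The variable name emgr_out is hypothetical, and I am assuming emgr -l returns a zero exit code when at least one ifix is installed.

```yaml
# Hypothetical verification tasks, appended after the lku task.
- name: List installed interim fixes
  ansible.builtin.command: emgr -l
  register: emgr_out
  changed_when: false   # read-only query, never reports a change

- name: Confirm the ifix label is present
  ansible.builtin.assert:
    that:
      - "'IJ45274s1a' in emgr_out.stdout"
    fail_msg: "ifix IJ45274s1a was not found in the emgr -l output"
```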

I look forward to seeing how the team further enhances this particular Ansible module for AIX live update.

References

Using Ansible for Automation with IBM Power Systems

Ansible on IBM Power Workshop

Simplifying AIX Live Update with PowerVC