Introduction
Minimizing downtime is critical for IBM AIX environments running mission-critical workloads. Live Kernel Update (LKU) enables administrators to apply kernel patches without rebooting, though traditional processes may still result in brief service interruptions. With AIX 7.3 TL3 SP1, IBM delivers major LKU enhancements: reducing blackout times, accelerating updates, and improving responsiveness for high-throughput and complex environments. In this blog, we’ll examine these key innovations, present performance benchmarks, and discuss how they help IT teams sustain uptime without compromise.
AIX LKU performance improvements in AIX 7.3 TL3 SP1
IBM has introduced significant enhancements to the AIX Live Kernel Update (LKU) process aimed at reducing blackout periods and overall update times.
- Traditionally sequential commands such as varyonvg and swapon are now parallelized during the blackout phase. Running these operations concurrently in the background cuts their combined execution time.
- Additionally, new LKU-optimized options have been added to critical commands like varyonvg, synclvodm, redefinevg, and importvg to skip redundant steps and avoid resource contention, further enhancing performance.
- Filesystem mounting and remounting have been reengineered: the slower traditional fork/exec sequences are replaced with multi-threaded techniques and direct system calls, which makes mounting and remounting faster and reduces downtime during updates.
- Trace collection is now disabled by default. When it is enabled, collection of lightweight memory traces and component traces uses compression to speed up file transfers, giving faster diagnostics with minimal system impact.
- Callout scripts in the AIX Live Update process execute custom or additional commands during the update, typically to handle tasks or system configurations that must be addressed as part of the live update workflow. The callout script framework has been refined by streamlining the script invocation logic, which reduces overhead and makes the overall update process faster.
- The LKU and blackout time estimation methodology has been fine-tuned, resulting in more accurate predictions that help administrators plan updates.
Together, these enhancements deliver a more streamlined, faster, and more reliable live update experience for AIX users on AIX 7.3 TL3 SP1 and later.
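To illustrate why parallelizing independent steps shortens the blackout phase, here is a minimal Python sketch. It does not call any AIX command and does not reflect how LKU is implemented internally. The task names and durations are hypothetical stand-ins, and `time.sleep` simulates the work each step would do. The point is only that independent steps run concurrently take roughly as long as the slowest step, rather than the sum of all steps.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent recovery steps. The names and
# durations are illustrative assumptions, not measured AIX values.
TASKS = {"varyon_vg1": 0.2, "varyon_vg2": 0.2, "swapon_paging": 0.2}


def run_task(name: str) -> str:
    """Simulate one recovery step by sleeping for its assumed duration."""
    time.sleep(TASKS[name])
    return name


def run_sequential() -> float:
    """Run every step one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for name in TASKS:
        run_task(name)
    return time.perf_counter() - start


def run_parallel() -> float:
    """Run all steps concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(TASKS)) as pool:
        # list() forces all results, so every task has finished here.
        list(pool.map(run_task, TASKS))
    return time.perf_counter() - start


if __name__ == "__main__":
    seq = run_sequential()
    par = run_parallel()
    print(f"sequential: {seq:.2f}s, parallel: {par:.2f}s")
```

In this toy model the sequential run takes about the sum of the three durations, while the parallel run takes about the longest single duration. Real gains depend on how independent the operations are and on contention for shared resources, which is the reason the text above also mentions options that avoid resource contention.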
AIX LKU performance using in-house Oracle OLTP Workload: AIX 7.3 TL1 SP2 vs TL3 SP1
To validate the impact of the architectural and command-level enhancements introduced in AIX 7.3 TL3 SP1, IBM conducted performance benchmarking using an Oracle OLTP workload, a representative scenario for high-throughput, latency-sensitive enterprise environments.
Test Configuration
- HMC: V10R3M1061
- PowerVC: 2.2.1.2
- Firmware: MM1060_107
- VIOS: 4.1.0.10
- AIX Versions: 7.3.1.2 (TL1 SP2) vs 7.3.3.1 (TL3 SP1)
- Hardware: 8 vCPU / 8 PU, 240 GB RAM
- Storage: 177 disks (150 GB each), 475 GB rootvg
- Workload: The LPAR runs 5 Oracle instances with a workload that drives CPU utilization to 85%
LKU Time Comparison
The benchmarking results show a significant reduction in total LKU time when moving from TL1 SP2 to TL3 SP1.
Test Results (LKU Time):
- TL1 SP2: ~65 minutes
- TL3 SP1: ~29 minutes
- → ~55% reduction
This improvement is attributed to:
- Parallel execution of volume group and swap operations
- Optimized command paths that skip redundant steps
- Faster remounting of filesystems
Figure 1. LKU Performance Comparison: AIX 7.3 TL1 SP2 vs TL3 SP1 — Average Duration Reduced by Over 50%
Blackout Time Comparison
The blackout window (a critical phase in terms of service disruption) also saw a substantial decrease in duration.
Test Results (Blackout Time):
- TL1 SP2: ~144 seconds
- TL3 SP1: ~35 seconds
- → ~75% reduction
This is a direct result of:
- Multi-threaded remounting
- Asynchronous system recovery tasks
- Reduced overhead in callout script execution
Figure 2. LKU Blackout Time Comparison: AIX 7.3.1.2 vs 7.3.3.1 — 4X Reduction in Average Duration Under Oracle OLTP Workload.
Figure 3. Measured Improvements in LKU and Blackout Times: AIX 7.3 TL1 SP2 vs TL3 SP1.
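The reduction percentages and the speedup factor quoted above follow directly from the approximate measured times. As a quick check, the arithmetic can be reproduced with a short Python sketch. The function names are illustrative, and the inputs are the rounded values reported in this post, so the results are approximate as well.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


def speedup(before: float, after: float) -> float:
    """Return how many times faster the new duration is."""
    return before / after


# Approximate values reported in this post.
LKU_MINUTES = (65, 29)         # TL1 SP2, TL3 SP1
BLACKOUT_SECONDS = (144, 35)   # TL1 SP2, TL3 SP1

if __name__ == "__main__":
    print(f"LKU time reduction:      {percent_reduction(*LKU_MINUTES):.1f}%")
    print(f"LKU speedup:             {speedup(*LKU_MINUTES):.1f}x")
    print(f"Blackout time reduction: {percent_reduction(*BLACKOUT_SECONDS):.1f}%")
    print(f"Blackout speedup:        {speedup(*BLACKOUT_SECONDS):.1f}x")
```

With these inputs the sketch reports an LKU time reduction of about 55.4% and a blackout time reduction of about 75.7%, which corresponds to a blackout speedup of roughly 4.1x. These figures are consistent with the rounded "~55%", "~75%", and "4X" values shown in the results and figure captions.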
Summary
The performance benchmarks demonstrate that AIX 7.3 TL3 SP1 delivers measurable and meaningful improvements in both LKU and blackout times. These gains are especially valuable in environments with large-scale storage, complex volume group configurations, or high availability requirements.
Authors
Vinod Boddukuri (AIX Live Update Architect), email: vinod.boddukuri@in.ibm.com
Kanchana Parepalli (AIX Live Update Development), email: kanchana.parepalli26@ibm.com
Barenya Nandy (AIX Live Update Development), email: bknandy@ibm.com
Sophia Jacob (AIX L3 Live Update and File Systems Support), email: sophiajacob@in.ibm.com
Sougata Sarkar (AIX System Testing), email: sougsark@in.ibm.com
Osvaldo Rodriguez (Custom Test Partnership), email: osvaldo@us.ibm.com
Thaitiel Jair Duran Valencia (Custom Test Partnership), email: thaitiel@ibm.com
Shajith Chandran (STSM, AIX Base Kernel Development & Performance), email: shajithchandran@in.ibm.com