
Firmware Best Practices to Minimize Downtime

By Chet Mehta posted Sat June 06, 2020 02:09 PM

  
Each year, between the Technical Collaboration Council (TCC) and the Large Users Group (LUG), we meet with some of our largest AIX and IBM i Power Systems customers in Austin and Rochester for a week-long immersive conference. Our development team looks forward to these events because they give us an excellent opportunity not only to get feedback on our future plans (under NDA, of course) but also to hear customer challenges first-hand so we can work to address them.

Over the years, firmware maintenance has surfaced as a common pain point. Since our development team spends a considerable amount of time providing security and other fixes (via Service Packs) for issues uncovered post-GA, it’s troubling (both for us and for customers) to see a system impacted by an issue for which a fix had already been released.

With customers consolidating many workloads on a single system, the impact of a firmware defect can be magnified. On the flip side, workload consolidation makes it challenging for customers to schedule maintenance windows to update firmware. The result, unfortunately, is that many customers fall far behind on firmware maintenance, exposing themselves to issues that are already fixed in later Service Packs or Releases.

This blog aims to provide a set of Firmware best practices that should minimize customer impact from known issues while maximizing system availability.

System Firmware Releases & Service Pack Cadence

The first systems of a Power Systems generation (e.g. Power8 Low-end) are supported by a new System Firmware release (e.g. FW 810). Support for additional systems in the same generation (e.g. Power8 Mid-range or High-end) is provided by subsequent releases (e.g. FW 820 or FW 830). The typical interval between releases is six months. Whenever possible, new releases include support for systems from prior releases in the same generation (e.g. FW 830 supported Power8 Low-end, Mid-range and High-end servers).

Each release is supported for a minimum of 2 years, which means that we plan quarterly delivery of Service Packs during that 2-year window. The last System Firmware release for a generation (e.g. FW 860 is the last release for Power8 servers) is supported until the product is withdrawn (at least 5 years). For this last release, Service Packs are planned quarterly for 2 years, with follow-on Service Packs delivered if and when necessary. Additionally, the last System Firmware release of a generation is meant to be all-encompassing: it includes all fixes from prior releases and supports all systems in that generation (e.g. FW 860 supports all PowerVM Power8 servers).
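
As a rough illustration of the cadence described above, the following sketch lists nominal Service Pack months for a release, assuming one Service Pack every three months across the 2-year minimum support window. The GA date, the function names and the first-of-month rounding are assumptions made for this example only; actual Service Pack dates are set by IBM and will differ.

```python
from datetime import date
from typing import List


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` after d's month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def planned_service_packs(ga: date, support_years: int = 2,
                          interval_months: int = 3) -> List[date]:
    """List nominal Service Pack months within the minimum support window."""
    window_end = add_months(ga, support_years * 12)
    packs = []
    k = 1
    while add_months(ga, k * interval_months) <= window_end:
        packs.append(add_months(ga, k * interval_months))
        k += 1
    return packs


if __name__ == "__main__":
    # Hypothetical GA date, chosen only for illustration.
    for sp in planned_service_packs(date(2020, 1, 15)):
        print(sp.strftime("%Y-%m"))
```

With a quarterly interval and a 2-year window, the sketch yields eight nominal Service Pack slots per release.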

Figure 1 below illustrates the Release & Service Pack cadence for Power8 System Firmware Releases that support PowerVM.

Firmware Release Timeline
Figure 1: System Firmware releases for POWER8 PowerVM Servers


System Firmware Release & Service Pack Facts

  • System Firmware Releases (i.e. Upgrades) are always disruptive.
  • Service Packs (Updates) within a release are cumulative (i.e. each includes all fixes from prior Service Packs) and typically concurrent (i.e. fixes can be applied and activated without requiring a system or partition reboot).
  • In rare cases a Service Pack may contain some fixes that are deferred (i.e. the fix can be applied to the system but won’t activate until the system or partition is rebooted).
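
The distinction between an upgrade and an update can be sketched as a small helper. This is an illustration only: the `(release, service_pack)` tuple encoding and the `classify_change` name are hypothetical and are not an IBM naming convention or tool.

```python
from typing import Tuple

# Hypothetical encoding of a firmware level: (release, service_pack),
# e.g. (860, 2) for the second Service Pack on FW 860.
Level = Tuple[int, int]


def classify_change(current: Level, target: Level) -> str:
    """Classify a firmware change using the facts listed above."""
    if target < current:
        raise ValueError("target level is older than the current level")
    if target == current:
        return "no change"
    if target[0] != current[0]:
        # Moving to a different Release is an upgrade, which is always disruptive.
        return "upgrade: disruptive"
    # Same Release, newer Service Pack: cumulative and typically concurrent.
    return "update: typically concurrent (check the README for deferred fixes)"


if __name__ == "__main__":
    print(classify_change((840, 3), (860, 1)))
    print(classify_change((860, 1), (860, 2)))
```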

System Firmware Best Practices

  • When a new Service Pack is released, review the README for the Service Pack, particularly the Service Pack Fix List to see if any critical fixes are applicable to your environment / configuration.
  • If a Service Pack includes a High Impact / PERvasive (HIPER) fix that is applicable to your environment, it is recommended that the Service Pack be installed as soon as a maintenance window can be scheduled.
  • It is okay to allow a “deferred fix” to remain pending until the next scheduled reboot.
  • It is important to be on a Release Level that is supported with Service Packs.
  • If you don't require the features / functions being introduced via a new Release Level, you may stay on the older Release Level as long as it is supported (i.e. Service Packs are still being delivered for it).
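
The best practices above can be read as a simple triage of each new Service Pack. The sketch below is one possible encoding, not an IBM tool: the `ServicePackReview` fields and the `recommend` function are hypothetical names, the inputs are assumed to come from a manual review of the README, and the handling of non-HIPER critical fixes is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServicePackReview:
    """Result of manually reviewing a Service Pack README (hypothetical fields)."""
    release_supported: bool    # current Release Level still receives Service Packs
    applicable_hiper: bool     # a HIPER fix applies to this environment
    applicable_critical: bool  # other critical fixes apply to this environment
    deferred_fixes: bool       # the Service Pack contains deferred fixes


def recommend(review: ServicePackReview) -> List[str]:
    """Return suggested actions in priority order."""
    actions = []
    if not review.release_supported:
        actions.append("plan an upgrade to a supported Release Level")
    if review.applicable_hiper:
        actions.append("install as soon as a maintenance window can be scheduled")
    elif review.applicable_critical:
        # Assumption: non-HIPER critical fixes wait for planned maintenance.
        actions.append("install at the next planned maintenance")
    if review.deferred_fixes:
        actions.append("deferred fixes may stay pending until the next scheduled reboot")
    return actions


if __name__ == "__main__":
    review = ServicePackReview(True, True, False, True)
    for action in recommend(review):
        print("-", action)
```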

System Firmware Update & Upgrade Recommendations

As a general rule, we recommend that customers plan firmware maintenance twice per year, though the frequency can be tailored to suit your environment. If firmware maintenance is planned twice a year, one of the two windows may need to be able to support an upgrade, in order to move off a Release that is no longer supported. The other maintenance window can be used for an update (i.e. a move to a newer Service Pack at the current release), which can typically be done concurrently.

As a point of reference, our "Best of Breed" customers are able to perform firmware maintenance every 6 months, with a 1-year firmware maintenance cycle being "above average". Not surprisingly, most of these customers leverage Live Partition Mobility (LPM) when necessary (either via the LBS LPM SRR Automation Tool or scripts) to avoid impacts to their critical workloads. Customers whose firmware maintenance interval exceeds 1 year have a high probability of being impacted by issues that could have been avoided by staying more current on System Firmware. Setting a goal to update firmware twice per year will help customers achieve at least a yearly cadence for firmware maintenance.
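
The reference points above can be expressed as a small classification. This is a sketch only: the function name is hypothetical, and treating every interval between 6 and 12 months as "above average" is an assumption, since the text names only the 6-month and 1-year points.

```python
def classify_cadence(months_between_maintenance: float) -> str:
    """Map a firmware maintenance interval (in months) to the categories above."""
    if months_between_maintenance <= 0:
        raise ValueError("interval must be positive")
    if months_between_maintenance <= 6:
        return "best of breed"
    if months_between_maintenance <= 12:
        return "above average"
    # Longer than one year: high probability of hitting already-fixed issues.
    return "at risk"


if __name__ == "__main__":
    for months in (6, 12, 18):
        print(months, "months:", classify_cadence(months))
```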

When scheduling Firmware Updates/Upgrades, customers should consider:
  • Maturity of the system and Release Level currently installed.
  • Applicability of fixes contained within the Service Pack to your environment.
  • Exploitation of new features or functions (H/W or S/W).

While new systems shipping from manufacturing always include the latest available Service Pack on the current System Firmware release, customers updating or upgrading their systems are advised to use the levels recommended by the Fix Level Recommendation Tool (FLRT). The recommended firmware levels factor in the field performance of the Release/Service Pack as well as any soon-to-be-released Service Pack that might include a HIPER fix. Following this strategy greatly reduces the impact of rare but possible scenarios, such as a regression in the newest Service Pack or the release of a HIPER Service Pack soon after updating or upgrading to newer firmware.

Figure 2 illustrates a possible Update / Upgrade path for twice-per-year Firmware Maintenance.

2X per year FW Maintenance
Figure 2: Example of twice per year Firmware Maintenance

 

Figure 3 shows a possible Update / Upgrade path for once-per-year Firmware Maintenance.


1X per year FW Maintenance
Figure 3: Example of once per year Firmware Maintenance


Conclusion

While it is important to keep current on Firmware, we recognize that customers must balance that against their business need for system availability. By following the recommendations in this blog, customers will be able to strike the right balance of staying current on firmware maintenance while minimizing system downtime.

Contacting the PowerVM Team

Have questions for the PowerVM team or want to learn more? Follow our discussion group on LinkedIn (IBM PowerVM) or on IBM Community Discussions.