Hey all,
Be aware that we are tracking an issue in 7.5.0 Update Package 5 where scheduled maintenance windows take longer than expected to complete. The delay comes from the postpatch cleanup step that removes glusterfs files. Most users should expect to see the glusterfs removal messages quoted in the APAR text below for 5-10 minutes while the cleanup runs.
If you upgrade to 7.5.0 UP5 and the host appears to be halted on glusterfs, let it run and do not attempt to reboot the appliance.
-- APAR text --
IJ45878: QRADAR UPGRADES TO 7.5.0 UPDATE PACKAGE 5 CAN TAKE AN EXTENDED AMOUNT OF TIME TO COMPLETE
Users who upgrade to QRadar 7.5.0 Update Package 5 can experience an issue where the upgrade takes longer to complete than expected. It has been identified that postpatch messages related to the removal of glusterfs files can cause the upgrade to appear to be halted or stuck when the upgrade is still in progress. The length of time required to clean up the glusterfs files depends on the available resources on the QRadar instance. The extended upgrade window issue impacts all users who attempt to upgrade to QRadar 7.5.0 UP5.
On-screen output
QRADAR-11816 [postpatch] Removing the remaining glusterfs things from host
QRADAR-11816 [postpatch:remove_glusterfs] Removing files related to glusterfs
Upgrade time impact
A. Most users reported that the upgrade appeared to be halted for 5-10 minutes.
B. A few users reported that the upgrade appeared to be halted for 1 or 2+ hours during the glusterfs file removal.
IMPORTANT: It is critical that users allow the upgrade to continue. Rebooting or restarting QRadar can lead to service issues where hostservices and docker do not start as expected. Restarting an upgrade before it completes can lead to data outages or rebuilding hosts that were restarted while an upgrade is in progress.
Workaround
Allow the upgrade to complete and do NOT restart the QRadar host. If the upgrade is halted on a 'Removing files related to glusterfs' message for more than 1 hour, contact support to report this issue.
-- End APAR IJ45878 --
As mentioned, most users should only experience the issue for 5-10 minutes. However, we have seen cases where the delay was significant (1-2+ hours) for hosts on older hardware or smaller VMs. Interrupting the upgrade with a hard reboot or restart prevented hostservices and docker from starting properly because the postpatch processes did not complete, and support was required to repair those hosts.
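If it helps to keep track of the wait during a maintenance window, here is a minimal sketch in Python that turns the timings above into a simple decision. This is not a QRadar tool and it does not read anything from the appliance. The function name `upgrade_wait_guidance` and the two constants are hypothetical, and the thresholds are simply the 5-10 minute typical window and the 1 hour support threshold from the APAR text. You note the time the 'Removing files related to glusterfs' message first appeared and pass it in yourself.

```python
from datetime import datetime, timedelta

# Thresholds taken from the APAR text; the names are illustrative only.
TYPICAL_WAIT = timedelta(minutes=10)      # most users: 5-10 minutes
SUPPORT_THRESHOLD = timedelta(hours=1)    # APAR: contact support after 1 hour


def upgrade_wait_guidance(message_seen_at: datetime, now: datetime) -> str:
    """Return guidance based on how long the glusterfs message has been shown."""
    elapsed = now - message_seen_at
    if elapsed < timedelta(0):
        raise ValueError("now is earlier than message_seen_at")
    if elapsed <= TYPICAL_WAIT:
        return "typical: let the upgrade run, do not reboot"
    if elapsed <= SUPPORT_THRESHOLD:
        return "extended: let the upgrade run, do not reboot"
    return "over 1 hour: contact support, do not reboot"


if __name__ == "__main__":
    # Example: the message appeared at 12:00 and it is now 12:25.
    seen = datetime(2023, 1, 1, 12, 0)
    print(upgrade_wait_guidance(seen, seen + timedelta(minutes=25)))
```

Note that every branch says not to reboot. The only thing that changes with time is whether to open a case with support.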
All users who upgrade can experience extended maintenance windows, so we are alerting users and updating the release notes for 7.5.0 UP5.
As always, if you have questions or concerns, let me know.
Jonathan