AIX


How can we update service pack on multiple AIX servers at once during maintenance window.

  • 1.  How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Wed May 01, 2024 06:00 AM

    Hello Team,

    How can we update the service pack on multiple servers at once during a maintenance window? We also need to reboot the servers after the SP update.

    How can we achieve this? We have a fixed 4-hour window for all servers (approx. 200 in a group).



    ------------------------------
    Manoj Kumar
    ------------------------------


  • 2.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Wed May 01, 2024 06:26 AM
    Planning ahead.

    1) Have a second boot disk that you can use on each system.
    2) Using NIM, alt-disk clone the boot disk and apply the updates to the clone.
    3) You probably already have ILMT and BigFix, but other methods would work to kick off remote reboots. During your outage window, all you're doing is rebooting.

    Depending on how volatile /home and /var are (and any other FS in rootvg), they may need to be moved out of rootvg.
    Otherwise you may need to resync the disks after the reboot.
    This also depends on how long it is between the clone/update and the actual reboot.

    We have a farm of about 150 systems. Our outage window is 90 minutes to cover fallback, but if everything goes as planned, we're down and up in 30-45 minutes, a lot of that time just waiting for apps to come down.
    With BigFix or whatever you use, you define which groups you want to work with if you're not doing everything.

    Then it's just going through and validating.

    I'm on my phone or I could cut/paste more detail if you'd like 





  • 3.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Wed May 01, 2024 06:55 AM

    Thanks Tom for the details.

    Until now we had a tool to patch the whole environment, and we are replacing it with another tool (its SP update support is still in progress). Until the new tool goes live, is there any script we can trigger to update the service pack?

    Here are our steps for performing a service pack update:

    1) We take the clone before the outage window (maybe 1 or 2 days before).

    2) We have the service pack available on the NIM server, so we just NFS-mount the directory on the client.

    3) On the client we install the service pack update, then reboot the server.

    Basically, during the maintenance window we apply the service pack on the original image rather than on the clone disk.

    If we were to apply the service pack on the clone disk, which method would we have to follow? And if we have to apply it on the original disk, how can we do that? Remember, a reboot is required in both cases.

    Any suggestion/help would be really appreciated. 

    Note: When a server is rebooted from the clone disk, one or two packages always break, especially the Guardium software. They have to be reinstalled/updated to work.



    ------------------------------
    Manoj Kumar
    ------------------------------



  • 4.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Thu May 02, 2024 04:03 AM

    Hi Manoj, 

    You can use the alt_disk_copy command to apply the service pack on the cloned disk. Since you already have a NIM server, you can also use NIM's alt_disk_install operation to do this.

    As you said, a reboot is required in both cases (updating the original disk and updating the cloned disk). But since you have a short maintenance window and many servers, with alt_disk_copy you need not take the backup/clone manually. It takes care of the backup, the update and setting the bootlist by itself. All you have to do is reboot to the cloned disk after the update is done.
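
    A minimal sketch of that single-command approach, assuming a spare disk called hdisk1 and the service pack filesets available under /mnt/sp (both names are placeholders, not values from this thread). The function only prints the command unless DRY_RUN=0 is set; check the flags against the alt_disk_copy man page for your AIX level before running it for real.

```shell
#!/bin/sh
# Sketch: clone rootvg to a spare disk and apply the SP to the clone.
# hdisk1 and /mnt/sp used in the example call are hypothetical.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command

clone_and_update() {
    target_disk=$1   # spare disk that will hold altinst_rootvg
    sp_dir=$2        # directory holding the service pack filesets
    # -d: target disk, -b update_all: update the installed filesets,
    # -l: location of the install images
    cmd="alt_disk_copy -d $target_disk -b update_all -l $sp_dir"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Example call:
#   clone_and_update hdisk1 /mnt/sp
```

    By default alt_disk_copy also points the bootlist at the clone, so the next reboot comes up on the updated disk and the original rootvg stays behind as the fallback.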



    ------------------------------
    Saikrishna Akkela
    ------------------------------



  • 5.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Thu May 02, 2024 05:54 AM

    Hi,

    If you are triggering 200 client SP updates simultaneously from the same NIM share over NFS, most probably your NIM server's IO will be screaming for death... and be veerrryy slow. I would not go this way.

    I don't know all the details, but if the circumstances are as described (200 AIX servers, 4-hour window), I would do the following, especially if you have a remote tool (Ansible etc.) in use. Without one, things are a "bit" more complicated.

    • Run mksysb 1-2 days before (assuming rootvg really IS just for the OS, i.e. very static, with no application filesystems in it). If doable, a mksysb to the VIOS optical library is recommended, since a restore from there is like 95% faster than via NIM.
    • Create a temporary FS locally on each server to hold the SP media (10G should be enough)
    • Copy the media locally to each server

    Then on the actual maintenance window:

    • Stop apps. (Hopefully there is some remote procedure for this as well?) App stop/start is usually one of the most time-consuming tasks, especially in WAS/Java environments with zillions of Java apps.
    • Trigger the SP update (since the media are local, it is a LOT faster than 200 simultaneous runs over NFS)
    • Reboot 
    • Start apps 
    • Remove Temp FS holding the medias
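
    The pre-staging step above could be sketched as follows, assuming key-based scp from a jump host, a host list file with one name per line, and a staging directory that already exists on each server. The paths, the file name and the batch size of 20 are all hypothetical. Copies run in limited batches so that the source is not hit by 200 transfers at once. With DRY_RUN=1 (the default) the function only prints what it would do.

```shell
#!/bin/sh
# Sketch: copy SP media to every server, a limited batch at a time.
# All paths and the batch size are placeholders, not values from this thread.
DRY_RUN=${DRY_RUN:-1}         # 1 = only print what would run
BATCH_SIZE=${BATCH_SIZE:-20}  # number of hosts copied to in parallel

barrier() {
    # Wait for every background copy started so far.
    if [ "$DRY_RUN" = "1" ]; then echo "wait"; else wait; fi
}

stage_media() {
    hostfile=$1   # one hostname or IP per line
    src=$2        # directory holding the SP media
    dest=$3       # staging directory on each server
    count=0
    while read -r host; do
        [ -n "$host" ] || continue
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp -rq $src $host:$dest"
        else
            scp -rq "$src" "$host:$dest" </dev/null &
        fi
        count=$((count + 1))
        # Pause after each full batch so the source is not flooded.
        [ $((count % BATCH_SIZE)) -ne 0 ] || barrier
    done < "$hostfile"
    barrier
}

# Example call:
#   stage_media hosts.txt /export/sp /stage/sp
```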


    ------------------------------
    Tommi Sihvo, Lead Service Architect
    Tietoevry Tech Services
    email tommi.sihvo@tietoevry.com mobile +358 (0)40 5180 Finland
    ------------------------------



  • 6.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Thu May 02, 2024 05:01 AM

    You can use groups in NIM to patch a number of servers in a single operation.
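
    As a rough illustration, run on the NIM master: define a machine group, then run one cust operation against the group. The group name, client names and lpp_source name are hypothetical, and the clients must already be defined as NIM standalone machines. The sketch prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: define a NIM machine group and update all members in one operation.
# Group, client and lpp_source names are placeholders.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

define_group() {
    group=$1; shift
    members=""
    for client in "$@"; do
        members="$members -a add_member=$client"
    done
    # $members is left unquoted on purpose so it splits into separate arguments.
    run nim -o define -t mac_group $members "$group"
}

update_group() {
    group=$1
    lpp_source=$2   # NIM lpp_source resource holding the service pack
    run nim -o cust -a lpp_source="$lpp_source" -a fixes=update_all "$group"
}

# Example calls:
#   define_group patch_grp1 aix01 aix02
#   update_group patch_grp1 sp_lpp
```

    Every member of the group pulls its filesets from the NIM master, so the load concern raised earlier in the thread still applies to large groups.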



    ------------------------------
    Phill Rowbottom
    ------------------------------



  • 7.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Thu May 02, 2024 08:29 AM

    A reboot of an AIX server takes about 5 to 10 minutes. With a 4-hour maintenance window you have 240 minutes, which is usually enough. I have only ever seen one application that took more than 2 hours to start.

    As many already suggested, do clones with alt_disk_copy and make the updates in the alt_rootvg instead of rootvg. Then you need only 15 minutes of downtime to reboot AIX.

    If you have PowerHA for your production workloads, you usually don't even need these 15 minutes of downtime, depending on how fast your application can be moved to another node.

    I'd also recommend looking at Live Update. If it works in your environment, you can save even more minutes.



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 8.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Fri May 03, 2024 06:32 AM

    First of all, thank you everyone for the responses. I am happy to be getting a lot of thoughts on this.

    But my question is still the same: if we have to update the service pack on a large number of servers, how can we update all of them at once without any tools?

    If we update the service pack and reboot the servers one by one, by whatever method, it is always going to take time, because some servers may take a while to come up. We always apply the service pack on the original image; for backup purposes we take the clone 1 or 2 days before. If we were only rebooting the servers during the patching window, how could we reboot all servers together at once? Earlier, when we had a tool to patch the servers, we could easily patch/reboot all servers together. Is there any way in AIX to achieve this without any tool?



    ------------------------------
    Manoj Kumar
    ------------------------------



  • 9.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Fri May 03, 2024 06:43 AM

    Manoj,

    reverse your thinking. Before, your clone was the backup. Now the clone is where you install the new version, and your original rootvg is your backup. This is the whole point of the story.

    • Create altinst_rootvg with alt_disk_copy 30 minutes before your maintenance window
    • Install the new service pack into it with alt_rootvg_op
    • Reboot the server from the new rootvg when you have the maintenance window

    You can reboot as many servers simultaneously as you want using Ansible or even simple dsh.

    The time you need to prepare the updates depends on your infrastructure performance. I wrote 30 minutes before the maintenance window, but frankly I usually do it automatically in the night before the maintenance window. Yes, using Ansible. But you may use anything you want. Scripting the procedure in plain old Korn shell is very easy, and you can schedule it using cron. No need for additional tools if you don't want to have them.
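
    A sketch of what such a preparation script might look like, to be run on each LPAR before the window. The disk name and media path in the example call are hypothetical, and the flags should be checked against the alt_disk_copy and alt_rootvg_op man pages for your AIX level. With DRY_RUN=1 (the default) it only prints the commands.

```shell
#!/bin/sh
# Sketch of the preparation phase: clone rootvg, update the clone, and
# point the bootlist at it. Nothing here reboots the server.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

prepare_alt_rootvg() {
    disk=$1     # spare disk for altinst_rootvg (placeholder name)
    sp_dir=$2   # local directory holding the SP media (placeholder path)
    # Each step runs only if the previous one succeeded.
    run alt_disk_copy -d "$disk" -B &&                    # clone; -B leaves the bootlist alone
    run alt_rootvg_op -W -d "$disk" &&                    # wake up the clone
    run alt_rootvg_op -C -b update_all -l "$sp_dir" &&    # apply the SP inside the clone
    run alt_rootvg_op -S -t &&                            # put it to sleep, rebuild boot image
    run bootlist -m normal "$disk"                        # next boot uses the updated clone
}

# Example call:
#   prepare_alt_rootvg hdisk1 /stage/sp
```

    Saved as a script, this can be scheduled from cron for the night before the window; the only work left inside the window is the reboot itself.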

    P.S. Yes, welcome to Common Europe Congress 2024 in Milan, Italy, where we (hopefully) will talk about automated patching procedures and strategies for AIX.



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 10.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Fri May 03, 2024 06:43 AM

    Hi,

    Assuming your AIX LPARs are visible on the HMC and the RMC connection is active, you can do an operating-system-type restart for multiple LPARs via the HMC with "one click":

    Select needed Lpars > Actions > Restart > Operating System ( Issue the operating system command to shut down and restart the logical partition normally)



    ------------------------------
    Tommi Sihvo, Lead Service Architect
    Tietoevry Tech Services
    email tommi.sihvo@tietoevry.com mobile +358 (0)40 5180 Finland
    ------------------------------



  • 11.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Fri May 03, 2024 07:44 AM
    Using just standard AIX, without other "tools"... 
    Can you ssh into the target systems using keys?  If so, spin through them and issue a "sudo shutdown -Fr 0" or some appropriate script that will reboot the remote systems





  • 12.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Fri May 03, 2024 09:58 AM

    Thank you again. 

    @Andrey: We wanted to apply the service pack to the original image, but if that is not workable, maybe we will use the clone image as you mentioned. Whenever we reboot a server from the clone image, one or two security software packages crash and have to be fixed. We wanted to avoid that situation. We don't have dsh in our environment, so how can we install it? Our jump server is a Linux server. Can we proceed from that jump server to all AIX LPARs? All AIX LPARs are reachable passwordless from that jump server.

    @Tom: Will this command "shutdown -Fr 0" run sequentially or in parallel?



    ------------------------------
    Manoj Kumar
    ------------------------------



  • 13.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Fri May 03, 2024 10:18 AM

    Manoj,

    it is not impossible. It is a matter of your requirements. An AIX update can take from 10 minutes up to several hours depending on your AIX configuration and your applications. You can make your downtime very short, in which case you follow the way I and others described here. Or you stay with your way of doing things and have longer downtimes. Your environment, your decision!

    Regarding the security software crashing, I can't say anything. It depends on the software, and you must open a case with the software vendor's support to get this fixed. I remember that for some software I had to regenerate the internal agent IDs every time I made a clone.

    dsh is a standard part of AIX and is installed with the dsm.dsh fileset:

    #lslpp -w /usr/bin/dsh
      File                                        Fileset               Type
      ----------------------------------------------------------------------------
      /usr/bin/dsh                                dsm.dsh               Symlink
    

    If you use a Linux jump server and all your AIX servers accept passwordless logins (shame on your security), you can write a small shell script to reboot your servers. Something like:

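    # myserver is a text file with one hostname per line
    for i in $(< myserver) ; do
      # -n keeps the backgrounded ssh from reading stdin
      ssh -n "$i" shutdown -Fr &
    done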

    Sorry, I didn't test it, and it is not something I usually do. I use Ansible or slightly more complicated scripts. But I hope you understand the idea: start the shutdown on each server in the background and don't wait for it to finish. Then sleep for some time (let's say 5 minutes) and ping the servers to see if they came back.
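
    The "sleep, then ping" step could be sketched like this from a Linux jump server. The function reads the same kind of host list file and reports which servers answer. The probe is a separate function so it can be swapped, for example for an ssh login test; the 5-second ping timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: report which hosts answer again after the reboot was triggered.
PROBE=${PROBE:-probe_ping}   # name of the function used to test one host

probe_ping() {
    # One echo request, give up after 5 seconds (Linux ping syntax).
    ping -c 1 -w 5 "$1" >/dev/null 2>&1 </dev/null
}

check_hosts() {
    hostfile=$1   # one hostname or IP per line
    down=""
    while read -r host; do
        [ -n "$host" ] || continue
        if "$PROBE" "$host"; then
            echo "$host up"
        else
            echo "$host DOWN"
            down="$down $host"
        fi
    done < "$hostfile"
    # Succeed only when every host answered.
    [ -z "$down" ]
}

# Example use after the reboot loop:
#   sleep 300
#   until check_hosts myserver; do sleep 60; done
```

    A ping reply only shows that the network stack is up, not that the applications are running, and a host that has not gone down yet also counts as "up", so the initial sleep matters.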



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 14.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Fri May 03, 2024 04:58 PM
    shutdown -Fr 0 runs on a single system, so run it on the target, not your jump box. I don’t know if sequential or parallel applies; it’s just 1 system, 1 command.

    dsh isn’t required

    If you have a text file with the hosts you want to reboot (hostnames or IP addresses):
    cat syslist.txt
    10.123.123.10
    10.123.123.11
    10.123.123.12


    Then, run a for loop
    for S in $(cat syslist.txt)
    do
    ssh -nq "$S" sudo shutdown -Fr 0 > "${S}-reboot.log" 2>&1 &
    done


    You’ll have to play with this a bit in your environment. Some sudo configurations require a tty, so you’d have to specify -tt on the ssh command,
    which also means it couldn’t run in the background.
    If the ID on the jump box doesn’t match the ID on the target server, then you’d have to specify that on the command line.
    Remember you need to ssh to an ID that has the ability to run shutdown directly or via sudo (or some equivalent command).

    This is just one brute force method…
    I’m surprised you have so many servers and don’t have ILMT running, and thus BigFix available… But maybe it’s just us that gets the IBM auditor abuse.

    Tom




  • 15.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    IBM Champion
    Posted Tue May 07, 2024 03:45 AM

    With all the references to "shutdown -Fr 0" it might be worth reading the below from @Russell Adams.

    "I have to lecture everyone about this...

    Never never never use shutdown -F! Can I get fireworks and sparkles?

    The -F flag for shutdown means FORCE down, not fast down. If your giant database doesn't come down in 60 seconds, -F will kill -9 it resulting in a crashed database. I've had multiple customers cause HA failovers (killed clstrmgrES) and many needless integrity checks of crashed DBs due to this.

    Regular shutdown is fine. It waits and sends repeated kill -15 until everything exits cleanly. Only use -F if you're already shutting down, and it isn't coming down and appears hung.

    I think IBM training got lazy years ago and taught "shutdown -Fr" instead of "shutdown -rl now". No one adds -F to mean force, they just forget to use "now" instead of the default delayed shutdown.

    If you're paying attention, -l means log the shutdown. You do check your boot and console alog's after booting? You want to keep records of what happened? Then log your shutdown.
    "

    Steve
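
    Applied to the reboot loops posted earlier in the thread, that advice might look like the sketch below. It uses the "shutdown -rl now" form from the quote, and assumes key-based ssh, sudo rights for shutdown on the targets, and a host list file (the file name in the example call is hypothetical). With DRY_RUN=1 (the default) it only prints the commands.

```shell
#!/bin/sh
# Sketch: send a graceful, logged reboot to every host in parallel.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

reboot_all() {
    hostfile=$1   # one hostname or IP per line
    for host in $(cat "$hostfile"); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ssh -nq $host sudo shutdown -rl now"
        else
            # Run in the background and keep one log per host on the jump box.
            ssh -nq "$host" sudo shutdown -rl now > "${host}-reboot.log" 2>&1 &
        fi
    done
}

# Example call:
#   reboot_all syslist.txt
```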



    ------------------------------
    Steve Munday
    AIX, IBM i, HMC, PowerVM
    ------------------------------



  • 16.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Tue May 07, 2024 05:14 AM

    Hello Team, thank you everyone for the swift responses. I will try to update things as you suggested.

    I really appreciate everyone's support and guidance.



    ------------------------------
    Manoj Kumar
    ------------------------------



  • 17.  RE: How can we update service pack on multiple AIX servers at once during maintenance window.

    Posted Wed May 08, 2024 11:03 AM
    Thanks for the explanation… I guess I’ve been ignorant since the V3.1 beta; muscle memory just types -Fr 0.
    I’ll work on -rl 0 and check it out.

    Thanks.