PowerVM


VIOS reboot time after update

  • 1.  VIOS reboot time after update

    Posted Tue September 03, 2024 12:53 PM

    Hi All,

I am planning to update VIOS from 3.1.3.14 to 3.1.4.41. Could anyone please let me know how long the reboot takes to apply the updates?

Also, if it doesn't take that long, could I leave the client systems (IBM i, AIX and Linux) online so they pick up their disk and network connections again after the VIOS boots?

Note: I have a single-VIOS environment.

    Regards

    Justin Francis



    ------------------------------
    Justin Francis
    ------------------------------


  • 2.  RE: VIOS reboot time after update

    Posted Tue September 03, 2024 01:03 PM
    Justin,

If you have a single VIOS, you should take a full outage of all LPARs
to upgrade the VIOS. This is the safest option.

    Thanks.




    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    https://adamssystems.nl/




  • 3.  RE: VIOS reboot time after update

    Posted Tue September 03, 2024 01:10 PM

    Thank you so much Russell!!

And normally, how long does a VIOS reboot take after the update?

    Regards

    Justin



    ------------------------------
    Justin Francis
    ------------------------------



  • 4.  RE: VIOS reboot time after update

    Posted Tue September 03, 2024 01:16 PM
    On Tue, Sep 03, 2024 at 05:10:02PM +0000, Justin Francis via IBM TechXchange Community wrote:
    > And normally, how long does a VIOS reboot take after the update?

    That can vary based on the speed of the system, the latency of the
    boot disks, the number of PCI cards, ports, and what they connect to.

    Ideally 3-5 minutes.

    Each disconnected HBA port can add 60 seconds, depending on the model.

No need to do a full power off; just reboot the VIOS LPAR. That's faster.
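For planning purposes, the rules of thumb above can be turned into a tiny back-of-envelope estimate. The constants below are just the numbers from this post, not official figures:

```python
# Back-of-envelope VIOS reboot estimate, using the rules of thumb above.
# These constants are assumptions taken from this thread, not official figures.
BASE_MINUTES = 4                # "ideally 3-5 minutes" -> midpoint
SECONDS_PER_IDLE_HBA_PORT = 60  # each disconnected HBA port can add ~60 s

def estimated_reboot_minutes(disconnected_hba_ports: int) -> float:
    """Estimate VIOS reboot time in minutes for a given number of idle HBA ports."""
    return BASE_MINUTES + disconnected_hba_ports * SECONDS_PER_IDLE_HBA_PORT / 60

print(estimated_reboot_minutes(0))  # no idle ports: ~4 minutes
print(estimated_reboot_minutes(4))  # four unused ports: ~8 minutes
```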

    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    https://adamssystems.nl/




  • 5.  RE: VIOS reboot time after update

    Posted Wed September 04, 2024 02:56 AM

And the number of disks and all other devices. I had a VIOS with ca. 8000 paths to disks (thanks to EMC). The reboot took ca. 30-40 minutes.



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 6.  RE: VIOS reboot time after update

    Posted Wed September 04, 2024 03:11 AM

    Hi Andrey,

Curious to know... you had 8000 disks on a VIOS? As workloads are not run on the VIOS itself, what are these 8000 disks used for?



    ------------------------------
    RUPESH THOTA
    ------------------------------



  • 7.  RE: VIOS reboot time after update

    Posted Wed September 04, 2024 05:41 AM

    8000 is a lot.  8 paths per disk?

Are you mapping them directly via vSCSI or using an SSP?



    ------------------------------
    José Pina Coelho
    IT Specialist at Kyndryl
    ------------------------------



  • 8.  RE: VIOS reboot time after update

    Posted Wed September 04, 2024 05:26 PM

Yes, as far as I recall, 8 paths per disk, with ca. 1000 disks mapped via vSCSI to client LPARs. Because of the fine EMC drivers, you get one device for each path plus one additional device for the disk itself. So instead of one device you have 9 devices, and if you have 100 disks, you will have 900 devices. If each device takes 0.5 seconds in cfgmgr, the whole boot time increases by 450 seconds - ca. 7-8 minutes.
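The arithmetic above can be written out as a quick sketch (the 0.5 s per device figure is an illustrative assumption from this post, not a measured value):

```python
# cfgmgr boot-time cost of path pseudo-devices, per the example above.
PATHS_PER_DISK = 8        # one device per path...
SECONDS_PER_DEVICE = 0.5  # assumed cfgmgr cost per device (illustrative)

def cfgmgr_seconds(disks: int) -> float:
    # ...plus one device for the disk itself: 9 devices per LUN instead of 1
    devices = disks * (PATHS_PER_DISK + 1)
    return devices * SECONDS_PER_DEVICE

print(cfgmgr_seconds(100))  # 100 disks -> 900 devices -> 450.0 seconds (~7-8 min)
```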



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 9.  RE: VIOS reboot time after update

    Posted Thu September 05, 2024 04:30 AM
    Edited by Satid S Thu September 05, 2024 04:32 AM

    Dear Andrey

    >>>>  I had VIOS with ca. 8000 paths to disks (thanks to EMC). The reboot took ca. 30-40 minutes. <<<<

    This is such an interesting and surprising fact to hear about. 

    >>>> 8 paths per disk <<<<

Did you ask EMC whether 4 paths per disk could be configured instead? If I were the customer, I would have adamantly asked for an explanation of why 8 paths were needed, because I would not want to waste my investment without knowing whether I received any particularly special benefit. I cannot see how 8 paths per disk would deliver any special benefit over 4 paths. Did EMC explain why specifically 8 paths were needed?



    ------------------------------
    Satid S
    ------------------------------



  • 10.  RE: VIOS reboot time after update

    Posted Fri September 06, 2024 07:52 AM

    What?  Eight paths to disk is not the common way to do it?

                          Display Disk Path Status                
                                                                  
                    Serial                     Resource   Path    
    Entry ASP Unit  Number          Type Model Name       Status  
        1   1    1  Y438C4000054    2145  050  DMP060     Active  
        2           Y438C4000054    2145  050  DMP117     Active  
        3           Y438C4000054    2145  050  DMP175     Passive 
        4           Y438C4000054    2145  050  DMP001     Active  
        5           Y438C4000054    2145  050  DMP059     Passive 
        6           Y438C4000054    2145  050  DMP118     Passive 
        7           Y438C4000054    2145  050  DMP176     Active  
        8           Y438C4000054    2145  050  DMP002     Passive 
        9   1    2  Y438C4000071    2145  050  DMP292     Active  
        ...

    Let's see...
Two VIOS LPARs.
Two 2-port FC cards per VIOS LPAR.
Two Fibre Channel switches between the Power system and the SAN.
Add a couple of ports on the SAN...
I've done VIOS maintenance and FC switch maintenance and kept running. And I've upgraded the SAN OS midday, midweek.



    ------------------------------
    Robert Berendt IBMChampion
    Business Systems Analyst, Lead
    Dekko
    Fort Wayne
    260-599-3160
    ------------------------------



  • 11.  RE: VIOS reboot time after update

    Posted Fri September 06, 2024 08:33 PM
    Edited by Satid S Fri September 06, 2024 08:48 PM

    Dear Robert

I see now it means 4 paths per VIOS; I originally thought 8 paths per VIOS! But then this confuses me, because all my customers use 4 NPIV paths per VIOS and it does not take more than 5 minutes to IPL each VIOS. Perhaps the long VIOS IPL time Andrey mentioned has to do with the fact that he uses vSCSI and there are so many LUNs involved?

And the mention of vSCSI is another puzzle for me, since I have a few IBM i customers using EMC and all of them use NPIV for connection. Do you have any idea why vSCSI is used here?



    ------------------------------
    Satid S
    ------------------------------



  • 12.  RE: VIOS reboot time after update

    Posted Sat September 07, 2024 01:34 PM
    Edited by Marc Rauzier Sat September 07, 2024 01:39 PM

And I got up to 16 paths on an installation I designed several (6 or 7) years ago. 4 were active and 12 were passive. The storage device was configured to provide HyperSwap functionality to IBM i PowerHA clusters using LUN-level switching, based on NPIV.

It was based on the redbook "IBM Storwize HyperSwap with IBM i" and used the same configuration as you show, i.e., for each IBM i LPAR, 2 VIOS, 2 distinct physical FC cards on each, 2 distinct FC switches, and, on each V7k (I don't remember the exact model), two controllers with two distinct FC ports.

At the time, we were running V7R3 and I wanted to confirm that IBM supported this setup. I never got an official statement, but when carefully reading the Maximum Capacities documentation, one can see that there is a small difference between V7R2 and V7R3:

    V7R2 (and older): Maximum number of connections to a logical unit or disk unit in an external storage server or Virtual I/O Server environment: 8

    V7R3 (and newer): Maximum number of active connections to a logical unit or disk unit in an external storage server or Virtual I/O Server environment: 8

The setup was using only 4 active paths (so fewer than 8) in any case, but the only response I got was something like "We always do our best to help customers" - and I was an IBMer at the time :-)



    ------------------------------
    Marc Rauzier
    ------------------------------



  • 13.  RE: VIOS reboot time after update

    Posted Mon September 09, 2024 02:39 AM

    Dear Marc

    >>>> And got up to 16 paths on an installation I designed several (6 or 7) years ago. 4 were active and 12 were passive. <<<<

Thanks for sharing your experience, but I'm quite curious as to what the "compelling reason" is to configure 12 passive paths instead of just 4. Is the specific use of HyperSwap for IBM i the cause of configuring 12 passive paths? If so, what is the reason for configuring these extra passive paths?



    ------------------------------
    Satid S
    ------------------------------



  • 14.  RE: VIOS reboot time after update

    Posted Mon September 09, 2024 06:08 AM

    Hello Satid

I don't know if you are familiar with Storwize-based storage devices (when I was working, Storwize was the name of the family of devices more or less managed by an SVC, such as V7k, V9k, etc.; I am not aware of the name IBM is using right now), so let me explain the setup.

A Storwize device provides two controllers, with only one active at any time. This allows controller failures or software updates/upgrades without disruption. Best practice is to use both controllers for any volume. The result is that you double the number of paths from an LPAR to the LUNs: half of the paths are active (i.e. going through the active controller), and half are passive (going through the other controller).

HyperSwap provides the capability to connect two storage devices, configure synchronous replication in both directions for selected volumes, and create a virtual volume for each pair of real volumes (this specific attribute was the reason for a tricky copy-description configuration step for PowerHA on IBM i, but that is another story and maybe it is easier now). In a HyperSwap relationship, only one device is active at any time (from the volume's point of view), and replication runs only from the active device to the other; but all paths to the inactive device must exist, in passive state for the IBM i LPAR, so that an automatic failover (including replication reversal) can take place if the active device fails.

With a simple setup, I mean 1 VIOS using 1 FC adapter and without HyperSwap, you have 2 paths: 1 active path to the active controller of the storage device and 1 passive path to the inactive controller. Here, you have redundancy only if you lose the active storage device controller.
Now, if you use 2 distinct FC adapters on the VIOS, connected to two distinct switches (and preferably zoned to distinct ports on the storage device), you get 4 paths (2 active to the active controller and 2 passive to the inactive controller). You add redundancy at the switch (and FC adapter) level.
Of course, you will set up dual VIOS, both with a similar configuration, and you get the 8 paths (4 active to the active controller and 4 passive to the inactive controller). This is the configuration we can see in Robert's post. You add redundancy at the VIOS level.

And now, after setting up HyperSwap with another storage device, you get my 16 paths: still the same 4 active paths to the active controller of the active device, 4 passive paths to the inactive controller of the active device, and 8 passive paths to the inactive device (4 to each controller).

With this setup, you have multiple redundancy levels. Of course, you can lose any single item (an FC adapter, a VIOS, a switch, a device controller or an entire device) without disruption. But you can even lose up to 3 items at the same time and the LPAR can still run (for instance, at the same time, a VIOS, a switch, and a device controller or even an entire device).

    Regarding the VIOS reboot time, we are talking here of NPIV configuration. So, virtual to physical FC adapters mapping does not have any significant impact.
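The doubling at each redundancy level described above can be sketched as a small calculation (the parameter names are just illustrative):

```python
# Paths per LUN as described above: each redundancy layer
# (VIOS, FC adapter/switch, controller, HyperSwap device)
# multiplies the number of paths the LPAR sees.
def paths_per_lun(vios: int, fc_adapters_per_vios: int,
                  controllers_per_device: int, storage_devices: int) -> int:
    return vios * fc_adapters_per_vios * controllers_per_device * storage_devices

print(paths_per_lun(1, 1, 2, 1))  # simple setup: 2 paths
print(paths_per_lun(1, 2, 2, 1))  # dual adapters/switches: 4 paths
print(paths_per_lun(2, 2, 2, 1))  # dual VIOS: 8 paths
print(paths_per_lun(2, 2, 2, 2))  # plus a HyperSwap pair: 16 paths
```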

    Hope that I replied to your question and did not increase any confusion :-)



    ------------------------------
    Marc Rauzier
    ------------------------------



  • 15.  RE: VIOS reboot time after update

    Posted Tue September 10, 2024 06:24 AM

With HyperSwap configured on Storwize, an initiator can potentially access the same LUN on both storage arrays via any path to each array. The array that performs I/O for this LUN has half of its paths active (the ones to the "master" controller for this LUN) and the other half passive/standby (the ones to the other controller); the standby Storwize has all of its paths in passive/standby state, of course.

This way you have 4 active + 4 passive + 8 passive = 16 paths in total.



    ------------------------------
    Lech Szychowski
    ------------------------------



  • 16.  RE: VIOS reboot time after update

    Posted Mon September 09, 2024 01:32 PM

    Hi Satid,

this is quite a usual configuration in a highly available environment, as others have already pointed out. It doesn't matter which storage you use; most of the time you will get 8 to 16 paths. In this case they were paths to two mirrored storage devices through two fabrics. Each storage device had 4 controllers, so it could have been even more ;-) Anyway, it was protected against every possible failure in the SAN or IBM Power domain - SAN switch, SAN fabric, storage, datacenter, FC HBA, or VIOS.

The only problem with EMC is that they didn't support the AIX ODM as well as their proprietary drivers on AIX. Not all functions were (still are?) available with a pure ODM connection. This is understandable - their drivers cost money, while ODM support is free. But it means that it is complicated to use EMC storage for the AIX rootvg, and many customers put rootvg on vSCSI disks if they have EMC storage. This is quite common. I also once heard the opinion that it is impossible to install AIX on EMC storage using NPIV, which is wrong. The configuration is described in the EMC Host Connectivity Guide and can be used, but it brings a lot of problems during AIX updates.



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 17.  RE: VIOS reboot time after update

    Posted Tue September 10, 2024 06:15 AM

    Dear Marc and Andrey

Thanks for both of your additional explanations, which help me see the use of SAN disks and data replication in a different way from my experience. I used to work with 7-8 IBM i customers who used the Storwize SAN storage family, but all of them used logical replication instead of PowerHA for i disk replication or HyperSwap. So I was only familiar with the use of 4+4 paths as shown in Robert's post. Gee! This makes me feel fortunate to work with IBM i and logical replication, as they appear to be less complicated than UNIX or disk replication :-)



    ------------------------------
    Satid S
    ------------------------------



  • 18.  RE: VIOS reboot time after update

    Posted Tue September 10, 2024 09:40 AM

    Hi Satid

I have worked with both and, in my opinion, both hardware and software replication have pros and cons - the usual "it depends" reply applies!

"PowerHA for i" might appear more complex to set up but, once you have configured it a couple of times, it becomes easy to use. The pain point is moving from full SYSBAS to IASP (I believe IBM never imagined all the various details customer systems could have that would make it difficult!). Once this step is fully complete (administrative domain configuration and maintenance is not easy, and I think IBM/Fortra did not provide all the tools required to make it easier, unless they have made progress since I retired), day-to-day operations and switchover tests are simple. In my opinion, software replication requires a higher monitoring workload, but I might be wrong given improvements over the years.


I faced another constraint with "PowerHA for i" which was related to the way the organization I was working for handled the various platforms. People assigned to the Storage side (switches and devices) and people (like me) assigned to the Server side were not under the same management line. To set up and manage PowerHA in as automated a way as possible, the server side must have access to the switches (for zoning configuration) and the storage devices (for volumes, host connections and other configuration, and for day-to-day operations such as FlashCopy, failover, failback...). Cooperation is not always easy due to responsibility boundaries (and those boundaries become more difficult to cross when you start talking about tools like PowerVC...).

Unfortunately, I had no opportunity to work on a "Db2 Mirror for i" project. I believe that product will help some organizations a lot with their "zero downtime" projects, and I would have loved to see it in action.



    ------------------------------
    Marc Rauzier
    ------------------------------



  • 19.  RE: VIOS reboot time after update

    Posted Wed September 11, 2024 11:53 AM

I mostly agree with Marc on this one, and find myself in the unheard-of situation of disagreeing with Satid. The difference is that I am solidly in the storage replication camp from a functionality perspective.

As a High Availability consultant, I have the advantage of experience with many different HA and DR solutions. To me, there is no comparison between any of the logical replication products (journal-based replication of objects) and storage replication. When properly configured, storage replication always has a consistent recovery point where the entire system (or IASP) is recoverable - usually with an RPO of 5 minutes or less for asynchronous replication, or even an RPO of 0 for synchronous. There is no question of whether any given object is replicated, or whether it is at the same sync point as any other object. The role swaps just work. DR tests become easy. Monitoring becomes easy.

There are many possibilities - PowerHA for i is perfect for IASP configurations and is a great solution for customers that can move applications to an IASP.

The Expert Labs PowerHA toolkit for IBM i can also handle full-system replication (and has many benefits for IASP-based replication); FlashCopy (full system or IASP) can be used to eliminate backup downtime, and safeguarded copies protect against malware and many other bad-news scenarios.

If you want insulation from storage array failures, HyperSwap, with its additional paths, can be a small price to pay for that protection.

The one benefit logical replication used to have over storage replication was the ability to use the target as a read-only mirror for reporting, and with the availability of Db2 Mirror, there is an alternative for that as well. I should also point out that if reporting is based on a specific point in time (e.g., completion of daily batch processing), the FlashCopy toolkit is a solid alternative for offloading reporting as well, whether it be regular data warehouse reporting or just keeping your query power users from wrecking your production system's performance.



    ------------------------------
    Vincent Greene
    IT Consultant
    Technology Expert labs
    IBM
    Vincent.Greene@ibm.com


    The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
    ------------------------------



  • 20.  RE: VIOS reboot time after update

    Posted Wed September 04, 2024 07:17 AM

    Hi Justin,

    I wanted to add to Russell's post regarding the upgrade process. It's crucial to shut down all your LPARs before proceeding. When the VIO server is rebooted, the LPARs will lose all I/O connections, so shutting them down beforehand helps safeguard your data and prevent any potential corruption from active applications or databases.

    Keep in mind that during an upgrade or migration, there are often additional steps, like applying software, that can extend the reboot time compared to a standard IPL. I would estimate the process could take around 10-15 minutes, depending on various factors.

Also, I'm curious: why do you only have one VIOS? A dual-VIOS setup would offer greater resilience and allow you to upgrade each server individually without disrupting LPAR connections.

    Thanks

    Lance Martincich



    ------------------------------
    Lance Martincich
    ------------------------------