PowerVM

  • 1.  "deep" LPM validation of all our LPARs

    Posted Fri March 15, 2024 06:25 AM

    Hello, 

    We have more than 200 LPARs (AIX / SUSE) distributed across nearly 30 Power servers: E850, E950, E980, E1080.

    We would like to put in place a weekly (pre)validation of LPM for all our LPARs, so that if a problem occurs we can be sure we are able to migrate them without issues.

    All our LPARs are virtualized, with PowerVM and the HMC at the latest levels.

    What would be the right command line to "fully/deeply" test LPM validation?

    > migrlpar -o v -m [source cec] -t [target cec] -p [lpar to migrate]

    Perhaps additional options are relevant for additional validation?

    In a few days/weeks we will deploy SRR (Simplified Remote Restart), so perhaps a similar command applies:

    > rrstartlpar -o validate -m <source server> -t <destination server> -p <lpar name> | --id <lpar id>   ... additional options ?

    Many thanks for your support.



    ------------------------------
    JF Noel
    Kyndryl Architect
    ------------------------------


  • 2.  RE: "deep" LPM validation of all our LPARs

    Posted Mon April 22, 2024 11:41 AM
    Edited by Carl Gerlach Tue April 23, 2024 11:47 AM

    I am extremely interested in your findings. I see no one has responded with a comment until now. We have been going through the process of verifying that everything that could cause LPM to fail is corrected on 450+ AIX servers across 21 frames.
    One setting we recently came across was the "Enable redundant error path reporting" option. Having it enabled on an AIX LPAR will cause LPM to fail. Most of the time, however, the validation succeeds but the actual migration fails for a variety of reasons.
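
    One way to look for that setting across a frame could be to parse HMC output. The sketch below assumes that `lssyscfg -r lpar -m <cec> -F name,redundant_err_path_reporting` on the HMC prints one `name,value` line per LPAR, with `1` meaning enabled. The attribute name and output format are assumptions to check against the lssyscfg man page for your HMC level, and the function name is hypothetical.

```python
def lpars_with_redundant_err_path(lssyscfg_output):
    """Return the names of LPARs whose reported value is "1" (enabled).

    Assumes each non-empty line looks like "lpar_name,value".
    """
    flagged = []
    for line in lssyscfg_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so the value is always the final field.
        name, _, value = line.rpartition(",")
        if value == "1":
            flagged.append(name)
    return flagged
```

    The output of the HMC command can be saved to a file per frame and passed to this function as a string; any LPAR it returns is a candidate for correction before the next LPM.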

    -Carl
    Infrastructure Design Engineer.

    IBM AIX 




  • 3.  RE: "deep" LPM validation of all our LPARs

    Posted Tue April 23, 2024 08:39 AM

    I don't think such deep validation is possible. There are many subtle things that can make LPM fail. Once I couldn't migrate an LPAR to a system because the SAN switch already had the same WWPN registered on the target system. I have no clue how it got there or where it came from. The HMC didn't show the WWPN, but the SAN switch saw it there and thus the LPM failed.

    Most of the time, running LPM validation (migrlpar -o v) is sufficient to know whether LPM will fail. The options to pass to migrlpar depend on your environment. You should standardize on one way of doing migrations and develop a script or some other automation that runs it with all the options you need. Then you can use the same command with -o v to validate it weekly.
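
    A weekly validation run along those lines might be sketched as follows. It only uses the `migrlpar -o v -m ... -t ... -p ...` form quoted in this thread. The function names, the `plan` list of (source, target, LPAR) tuples, and the `hscroot@hmc1` ssh target are hypothetical, and it assumes a non-zero exit status means validation reported errors. Warnings may still appear with a zero status, so the output is kept for review.

```python
import shlex
import subprocess


def build_validate_cmd(source, target, lpar):
    """Return the migrlpar validation command as an argv list."""
    return ["migrlpar", "-o", "v", "-m", source, "-t", target, "-p", lpar]


def validate_all(plan, hmc=None, runner=subprocess.run):
    """Validate each (source, target, lpar) entry in plan.

    Returns a list of (lpar, ok, output) tuples. If hmc is given
    (for example "hscroot@hmc1", a hypothetical value), the command
    is sent to that host over ssh as a single quoted string.
    """
    results = []
    for source, target, lpar in plan:
        cmd = build_validate_cmd(source, target, lpar)
        if hmc:
            cmd = ["ssh", hmc, shlex.join(cmd)]
        proc = runner(cmd, capture_output=True, text=True)
        output = (proc.stdout + proc.stderr).strip()
        results.append((lpar, proc.returncode == 0, output))
    return results
```

    Called from cron once a week with a plan covering every LPAR, the entries where `ok` is false (or where the output is not empty) give the list to investigate. Any extra migrlpar options used for real migrations would need to be added to `build_validate_cmd` so that validation and migration stay identical.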



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 4.  RE: "deep" LPM validation of all our LPARs

    Posted Fri May 24, 2024 09:54 PM

    Dear Jean-Francois

    Given Andrey's point that a successful validation command does not always mean the actual mobility will succeed, why not routinely perform LPM (or offline mobility if possible, during a low-workload period) on a number of high-priority LPARs at a time, spread over a year? Quite a few of my customers who have DR systems move their workload between PRD and DR systems (not necessarily using LPM) every 6 months or so to ensure that the switchover really works, because this is the only way to know for sure. I think it is sensible for you to do this at least for high-priority LPARs, if not for all.



    ------------------------------
    Satid S
    ------------------------------



  • 5.  RE: "deep" LPM validation of all our LPARs

    Posted Mon May 27, 2024 03:26 AM

    Hello,

    Thank you, Satid and Andrey, for your answers.

    I fully agree with your remarks: the best way to test LPM is to do it in real life ;).

    We are planning to perform, once a year, an LPM from one site to the other for all our production systems (DRP test exercise).

    But we also regularly perform LPM for workload rebalancing or frame maintenance.
    Sometimes an LPM fails unexpectedly and stays in an inconsistent state, 100% completed but hung, forcing us to stop and restart the LPAR, which causes service unavailability.
    Any good advice?

    Have a good day.



    ------------------------------
    Jean-Francois Noel
    ------------------------------



  • 6.  RE: "deep" LPM validation of all our LPARs

    Posted Tue May 28, 2024 08:07 AM

    Dear Jean-Francois

    >>>> Sometimes an LPM fails unexpectedly and stays in an inconsistent state, 100% completed but hung, forcing us to stop and restart the LPAR, which causes service unavailability. Any good advice? <<<<

    I hope you are aware of this IBM Technote that describes how to gather data for analyzing LPM problems: Complete Guide To Must Gather LPM Data Collection on PowerVC, VIO, AIX, Linux and IBM i at https://www.ibm.com/support/pages/node/887093

    As for what may cause your occasional LPM issue, my only guess is that it has something to do with LPM performance, which is discussed here:

    Live Partition Mobility Performance at https://www.ibm.com/support/pages/live-partition-mobility-performance

    Live Partition Mobility (LPM) Performance Tips and Results at https://community.ibm.com/community/user/power/blogs/pete-heyrman1/2020/06/17/live-partition-mobility-lpm-performance-tips-and-r

    Network Performance Recommendations for Live Partition Mobility at https://www.linkedin.com/pulse/network-performance-recommendations-live-partition-jose-luis

    Networking best practices to implement and manage an LPM environment at https://www.ibm.com/support/pages/best-practices-live-partition-mobility-lpm-networking



    ------------------------------
    Satid S
    ------------------------------



  • 7.  RE: "deep" LPM validation of all our LPARs

    Posted Tue May 28, 2024 10:26 AM

    I was not aware of all these very important documents. We will go through them and come back if we need any explanation.

    Many thanks for your help.

    JFN.



    ------------------------------
    Jean-Francois Noel
    ------------------------------



  • 8.  RE: "deep" LPM validation of all our LPARs

    Posted Mon August 26, 2024 11:13 AM
    Edited by Pete Heyrman Mon August 26, 2024 01:48 PM

    Hello Jean-François

    Hopefully you remember me, as I have now been retired from IBM/Kyndryl for almost 3 years.

    When I was still on board, the IBM i team used the "PowerVM LPM/SRR Automation tool" from Lab Services (https://www.ibm.com/support/pages/powervm-lpmsrr-automation-tool-launch-page).

    Maybe this tool is still available (I remember that IBM provided a new version for each new POWER processor) and they are still using it. And maybe it has an option that could fulfill your need?

    Anyway, you can try contacting them!

    Hope everything is running fine for you and the Kyndryl team.
    Regards



    ------------------------------
    Marc Rauzier
    ------------------------------