I hope you remember me; it has been almost 3 years since I retired from IBM/Kyndryl.
When I was still on board, the IBM i team used the "PowerVM LPM/SRR Automation tool" from Lab Services (https://www.ibm.com/support/pages/powervm-lpmsrr-automation-tool-launch-page).
Maybe this tool is still available (I remember that IBM provided a new version for each new POWER processor) and they are still using it? And maybe it has an option that could meet your need?
I hope everything is running fine for you and the Kyndryl team.
Regards
Original Message:
Sent: Tue May 28, 2024 10:25 AM
From: Jean-Francois Noel
Subject: "deep" LPM validation of all our LPARs
I was not aware of all these very important documents. We will go through them and come back if we need any explanation.
Many thanks for your help.
JFN.
------------------------------
Jean-Francois Noel
Original Message:
Sent: Tue May 28, 2024 08:06 AM
From: Satid S
Subject: "deep" LPM validation of all our LPARs
Dear Jean-Francois
>>>> Sometimes an LPM fails unexpectedly and stays in an inconsistent state, 100% completed but hung, forcing the LPAR to be stopped and restarted, which causes service unavailability. Any good advice? <<<<
I hope you are aware of this IBM Technote that describes how to gather data for analysis of LPM problems: Complete Guide To Must Gather LPM Data Collection on PowerVC, VIO, AIX, Linux and IBM i at https://www.ibm.com/support/pages/node/887093
As for what may cause your occasional LPM issue, my only guess is that it has something to do with LPM performance, which is discussed here:
Live Partition Mobility Performance at https://www.ibm.com/support/pages/live-partition-mobility-performance
Live Partition Mobility (LPM) Performance Tips and Results at https://community.ibm.com/community/user/power/blogs/pete-heyrman1/2020/06/17/live-partition-mobility-lpm-performance-tips-and-r
Network Performance Recommendations for Live Partition Mobility at https://www.linkedin.com/pulse/network-performance-recommendations-live-partition-jose-luis
Networking best practices to implement and manage an LPM environment at https://www.ibm.com/support/pages/best-practices-live-partition-mobility-lpm-networking
------------------------------
Satid S
Original Message:
Sent: Mon May 27, 2024 03:25 AM
From: Jean-Francois Noel
Subject: "deep" LPM validation of all our LPARs
Hello,
Thank you, Satid and Andrey, for your answers.
I fully agree with your remarks: the best way to test LPM is to do it in real life ;)
We plan to perform an LPM from one site to the other once a year for all our production systems (as a DRP test exercise).
But we also regularly perform LPM for workload rebalancing or frame maintenance.
Sometimes an LPM fails unexpectedly and stays in an inconsistent state, 100% completed but hung, forcing the LPAR to be stopped and restarted, which causes service unavailability.
Any good advice?
Have a good day.
------------------------------
Jean-Francois Noel
Original Message:
Sent: Fri May 24, 2024 09:54 PM
From: Satid S
Subject: "deep" LPM validation of all our LPARs
Dear Jean-Francois
Given Andrey's response that checking with the command does not always mean the actual mobility will succeed, why not routinely perform LPM (or offline mobility if possible, during a low-workload period) for a number of high-priority LPARs at a time, spread over a year? Quite a few of my customers who have DR systems move their workload between PRD and DR systems (not necessarily using LPM) every 6 months or so to ensure the switchover really works, because this is the only way to know for sure that it works. I think it is sensible for you to do this at least for high-priority LPARs, if not for all.
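To picture the rotation suggested above, here is a minimal sketch in Python that spreads a list of LPARs over a number of maintenance windows in a year. The function name `rotation_batches`, the LPAR names, and the choice of a simple round-robin split are assumptions made for illustration only; a real plan would also weigh priority and frame placement.

```python
from typing import List, Sequence


def rotation_batches(lpars: Sequence[str], windows: int) -> List[List[str]]:
    """Spread LPAR names round-robin over `windows` maintenance windows."""
    if windows < 1:
        raise ValueError("windows must be at least 1")
    batches: List[List[str]] = [[] for _ in range(windows)]
    for index, lpar in enumerate(lpars):
        # Round-robin keeps the batch sizes within one LPAR of each other.
        batches[index % windows].append(lpar)
    return batches


if __name__ == "__main__":
    # Hypothetical high-priority LPAR names, not taken from any real system.
    high_priority = ["lpar01", "lpar02", "lpar03", "lpar04", "lpar05"]
    for window, batch in enumerate(rotation_batches(high_priority, 2), start=1):
        print("window", window, batch)
```

Each batch can then be scheduled for a real mobility test in its own low-workload period.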
------------------------------
Satid S
Original Message:
Sent: Fri March 15, 2024 06:24 AM
From: Jean-Francois Noel
Subject: "deep" LPM validation of all our LPARs
Hello,
We have more than 200 LPARs (AIX / SUSE) distributed across nearly 30 Power servers: E850, E950, E980, E1080.
We would like to put in place a weekly LPM (pre)validation of all our LPARs, so that we are sure we can migrate them without issue if a problem occurs.
All our LPARs are virtualized; PowerVM is at the latest level, as is the HMC.
What would be the right command line to "fully/deeply" test LPM validation?
> migrlpar -o v -m [source cec] -t [target cec] -p [lpar to migrate]
Perhaps additional options are relevant for additional validation?
In a few days/weeks we will deploy SRR; perhaps a similar command applies:
> rrstartlpar -o validate -m <source server> -t <destination server> -p <lpar name> | --id <lpar id> ... additional options ?
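As an illustration of the weekly pre-validation described above, the following is a minimal sketch in Python that only generates one `migrlpar -o v` command line per LPAR from an inventory of (LPAR, source, target) entries. The function name `build_validation_commands` and all server and LPAR names are hypothetical; the sketch uses only the options already shown in the command above and does not answer whether further options give a deeper validation.

```python
import shlex
from typing import Iterable, List, Tuple


def build_validation_commands(plan: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Build one `migrlpar -o v` command per (lpar, source, target) entry."""
    commands = []
    for lpar, source, target in plan:
        # shlex.quote protects names that contain spaces or shell metacharacters.
        commands.append(
            "migrlpar -o v -m {} -t {} -p {}".format(
                shlex.quote(source), shlex.quote(target), shlex.quote(lpar)
            )
        )
    return commands


if __name__ == "__main__":
    # Hypothetical inventory: placeholder names, not real systems.
    plan = [
        ("lpar01", "cecA", "cecB"),
        ("lpar02", "cecA", "cecC"),
    ]
    for command in build_validation_commands(plan):
        print(command)
```

The generated lines could then be run on the HMC from a weekly scheduled job, with each command's exit status and output recorded. As noted elsewhere in this thread, a successful validation does not guarantee that the actual migration will succeed.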
Many thanks for your support.
------------------------------
JF Noel
Kyndryl Architect
------------------------------