PowerHA for AIX

  • 1.  Cannot start service after stopping service in unmanaged

    Posted Tue November 21, 2023 10:05 AM

    Good day,

    This morning, due to an LPM of the primary cluster node, I stopped cluster services in unmanaged mode.

    After the LPM back to the original frame, when I start cluster services again I get:

    cl_rc.cluster: Error: Changes have been made to the Cluster Topology or Resource

    configuration. The Cluster Configuration must be synchronized before

    starting Cluster Services.

    But when I start synchronisation, I get a message that it cannot be done because of the unmanaged status.

    Cluster verification shows no errors.

    Version:

    # clhaver

    Node xxxxxxxx has HACMP version 7242 installed

    Node xxxxxxxx has HACMP version 7242 installed
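
    A minimal sketch, assuming the `clhaver` output format shown above, of how one might check that every node reports the same version before opening a support case. The function names `parse_clhaver` and `versions_consistent` are hypothetical helpers, not part of PowerHA, and the sample text is copied from this post with the node names still masked.

```python
import re

# Sample copied from the post above; node names are masked in the original.
CLHAVER_OUTPUT = """\
Node xxxxxxxx has HACMP version 7242 installed
Node xxxxxxxx has HACMP version 7242 installed
"""

# Assumed line format, based only on the two lines quoted in the post.
LINE_RE = re.compile(r"^Node (\S+) has HACMP version (\d+) installed$")


def parse_clhaver(text):
    """Return a list of (node, version) tuples, one per recognised line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


def versions_consistent(entries):
    """True when at least one node was parsed and all report the same version."""
    return bool(entries) and len({version for _, version in entries}) == 1


entries = parse_clhaver(CLHAVER_OUTPUT)
print(entries)
print(versions_consistent(entries))
```

    This only compares the raw version strings; it does not try to decode them into a release or service pack level.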

    Is there a workaround? I do not want to stop the cluster, obviously.

    Thank you,

    Regards,

    Michel.



    ------------------------------
    Michel de Kraker
    ------------------------------


  • 2.  RE: Cannot start service after stopping service in unmanaged

    Posted Wed November 22, 2023 06:17 PM

    As I understand it, unmanaged resource groups cannot be brought back under PowerHA management (a big weakness, I think).

    I think this is because, in order to "know" that everything is OK, PowerHA needs to start the RGs from scratch; it cannot know what you have been doing while the RG was unmanaged.

    I think you must restart the cluster before the RG can be started again.

     

    Regards

     

    Luis A. Rojas Kramer

     






  • 3.  RE: Cannot start service after stopping service in unmanaged

    Posted Thu November 23, 2023 01:09 AM

    Hi Luis,

    I put the RG in unmanaged mode, then moved the LPAR to another P9 with LPM. Within 3 hours I moved the LPAR back to the original P9 with LPM, without changing the cluster configuration, of course. And now I have to restart the cluster.

    It is what it is, but for software this expensive, it is unbelievable that this is happening.

    Regards

    Michel.



    ------------------------------
    Michel de Kraker
    ------------------------------



  • 4.  RE: Cannot start service after stopping service in unmanaged

    Posted Thu November 23, 2023 02:15 AM
    As I understand it, yes, you made a lot of changes when you moved your LPAR back and forth between servers with LPM.
    PowerHA can detect changes in the LPAR ID, system ID, device logical IDs, volume group timestamps, and many other attributes.

    From PowerHA's point of view, one change is enough to say it is not the same system.
    Sorry, but that is how it works.

    You might think this should not happen with expensive software, but I think it would be very complicated for PowerHA to guarantee cluster stability by simply taking over your RGs as you have been using them, without knowing the real status of all the RGs' resources.
    Maybe we will see this fixed in a future version, but I doubt it.

    Luis Rojas







  • 5.  RE: Cannot start service after stopping service in unmanaged

    Posted Thu November 23, 2023 02:25 AM

    Hi Luis,

    I moved 9 primary PowerHA nodes with LPM.

    5 of them were OK after the LPM back to the original P9.

    4 out of 9 have this issue: I am not able to put them back in managed mode.

    All of the primary PowerHA nodes were moved to the same P9 and back.

    The only thing that changed was the firmware on both P9s.

    If that were the problem, all 9 LPARs should have the same issue.

    But so be it. I will plan downtime for 4 clusters.

    Regards

    Michel.



    ------------------------------
    Michel de Kraker
    ------------------------------



  • 6.  RE: Cannot start service after stopping service in unmanaged

    Posted Fri November 24, 2023 05:58 AM
    Hi Michel,

    As per the PowerHA admin guide (link below), during LPM you should be able to stop PowerHA in the UNMANAGE state and resume the services again after the LPM is done (please read page 34, "LPM Node Policy"):


    So your issue is probably a defect of some sort. Your PowerHA level has been out of service pack support since April this year, and an upgrade to a supported level might have fixed this issue. In any case, if you wish, you can open a case with IBM support to understand why this happened, provided there is data available to collect for an RCA.

    Regards,
    Mostafa M Mahmoud
    PowerHA / CAA / VMRM / RSCT Development Support Engineer
    IBM Client Innovation Center - Cairo, Egypt








  • 7.  RE: Cannot start service after stopping service in unmanaged

    Posted Thu November 23, 2023 02:54 AM
    Edited by Andrey Klyachkin Thu November 23, 2023 02:55 AM

    As I understand it, unmanaged resource groups cannot be brought back under PowerHA management (a big weakness, I think).

    https://www.ibm.com/docs/en/powerha-aix/7.2?topic=pscs-starting-cluster-services-node-resource-group-in-unmanaged-state



    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------



  • 8.  RE: Cannot start service after stopping service in unmanaged

    Posted Thu November 23, 2023 03:00 AM

    Hi Michel,

    • Every expensive software product from IBM comes with support. We don't know your PowerHA/AIX/hardware configuration, and I don't think you want to upload a snap to the forum. Open a case and IBM will give you a workaround.
    • From what I see, you have PowerHA 7.2.4, which is not the newest version. I don't remember since which version PowerHA has supported LPM, but the support was not there from the very beginning. It might be that your version is too old. Yes, even if it is not supported, it might work.
    • You have two cluster nodes, and it looks like they are in different states. If you didn't make any changes to the PowerHA configuration, try to sync from the other node, which was not LPMed.


    ------------------------------
    Andrey Klyachkin

    https://www.power-devops.com
    ------------------------------