IBM FlashSystem

  • 1.  FS7300 PBHA maintenance on the Ethernet core switch

    Posted 15 days ago

    The customer has two IBM FlashSystem 7300 systems with an HA policy configured over the FC network.

    The customer plans to perform maintenance on the Ethernet core switch, so routing will not work during that time. As a result, the HA cluster will lose access to the IP quorum, and IP access between the two FS7300 systems will also be lost.

    How serious is this, given that the partnership between the systems is established over the FC network? And should one of the systems be put into maintenance mode?



    ------------------------------
    A. S.
    ------------------------------


  • 2.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 14 days ago

    Hello. Shortly after the IP maintenance starts, the systems will likely lose access to the IP quorum application(s), as you say. This will cause an event to be logged on both systems, which you can ignore because it is expected. HA will keep running, but it will switch into a mode where the management system for each partition automatically wins any quorum race if the systems lose FC connectivity to each other. Once the maintenance completes and the systems reconnect to quorum, HA switches back into the normal mode; you can validate this because the event about missing quorum will automatically be marked as fixed.

    One thing to note is that some management actions within the partition will fail while the IP connectivity to the HA partner is missing. This is because the management system cannot reflect configuration changes to the HA partner and the design is to favour keeping HA active over permitting management actions.



    ------------------------------
    Chris Bulmer
    ------------------------------



  • 3.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 13 days ago
    Edited by A. S. 13 days ago

    Hello!

    Can you explain in more detail what happens if the systems lose connection via FC?
    How will this affect the availability of volumes to hosts?
    I just don't fully understand.

    Did I understand correctly that if the systems lose communication over FC, the management system will continue to serve host I/O and the second system will go into standby mode?

    ------------------------------
    A. S.
    ------------------------------



  • 4.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 13 days ago

    Your understanding is good. If the systems lose connectivity to each other over Fibre Channel for more than a few seconds (HA will ride through link glitches), they will start a quorum race. If quorum was already missing before this point, the answer is predetermined and the current management system will win the race: the volumes will remain online at the management system (and configuration changes will be permitted on that system), and the volumes will be offline on the non-management system. If there are multiple HA partitions on the systems, each partition will continue running on its own management system (different partitions can have different preferences of management system).

    When the link is re-established, HA will automatically start to resynchronise and will roll forward any configuration changes that happened while disconnected.

    Any non-HA volumes on either system are not affected by the link disconnecting. 
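
    The behaviour described above can be summarised as a small decision model. This is only an illustrative sketch in Python, not product code: the function name, the partition names, and the system names are hypothetical, and the case where quorum is still reachable is left as "decided by quorum race" because the thread does not say who wins then.

```python
def volumes_online_on(management_system_by_partition, fc_link_up, quorum_reachable):
    """Return, per HA partition, where its volumes stay online.

    management_system_by_partition: dict of partition name -> name of the
    system currently acting as that partition's management system
    (all names here are hypothetical).
    """
    result = {}
    for partition, management_system in management_system_by_partition.items():
        if fc_link_up:
            # HA keeps running; losing IP quorum alone only logs an event.
            result[partition] = "both systems"
        elif not quorum_reachable:
            # Quorum was already missing, so the outcome is predetermined:
            # the partition's management system wins.
            result[partition] = management_system
        else:
            # Quorum is reachable: the winner is not predetermined.
            result[partition] = "decided by quorum race"
    return result


# Hypothetical example: two partitions with different management systems,
# FC link lost while the IP quorum is unreachable.
partitions = {"partition_a": "FS7300-1", "partition_b": "FS7300-2"}
print(volumes_online_on(partitions, fc_link_up=False, quorum_reachable=False))
```

    Note that the result is per partition, which reflects the point that different partitions can prefer different management systems.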



    ------------------------------
    Chris Bulmer
    ------------------------------



  • 5.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 14 days ago

    Hello A. S.

    Regarding the IP quorum: the FS7300 also has local quorums on its internal disks, and the IP quorum has a special role in HA clustering and possible split-brain situations. So if it is not accessible, there will be no impact on the HA cluster, but you will see some error messages in the event log. After the maintenance on the core switches, these must be resolved.

    When PBHA is based on the FC infrastructure, there will also be no impact. You may get some events if the Ethernet ports of the FS7300 are affected, but HA will continue to work.

    Just make sure that you document the state before the maintenance and, afterwards, take care that the events/errors on the systems are resolved.
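
    One way to compare the documented state before and after the maintenance is to diff two saved lists of open events. The following Python sketch assumes each snapshot is simply a list of text lines, one per unfixed event, saved by whatever means you prefer; the function name and the sample event texts are hypothetical and do not reflect real event log output.

```python
def compare_event_snapshots(before, after):
    """Compare two lists of open-event lines captured before and after maintenance."""
    # Normalise whitespace and ignore blank lines so formatting noise is not reported.
    before_set = {line.strip() for line in before if line.strip()}
    after_set = {line.strip() for line in after if line.strip()}
    return {
        "new": sorted(after_set - before_set),        # appeared during maintenance
        "cleared": sorted(before_set - after_set),    # no longer open afterwards
        "still_open": sorted(before_set & after_set), # open both before and after
    }


# Hypothetical snapshots; the event texts are made up for illustration.
before = ["drive firmware update available"]
after = ["drive firmware update available", "quorum application unreachable"]
for category, events in compare_event_snapshots(before, after).items():
    print(category, events)
```

    Anything listed under "new" after the maintenance window is a candidate for follow-up if it is not marked as fixed automatically.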



    ------------------------------
    Dorde Knezevic
    ------------------------------



  • 6.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 14 days ago

    Hi A. S.,

    as far as I understand, there is no need to put either system into "maintenance mode", whatever exactly that was supposed to mean.

    It would be sufficient to suspend the partnership between the two systems instead. It can be resumed once the network maintenance work is completed.

    The replication will then catch up from where it was stopped; the changes are tracked by means of a bitmap.
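
    To illustrate the general idea of bitmap-based change tracking, here is a minimal Python sketch. It is not how the product implements it: the class name, the grain size of 256 KiB, and the one-byte-per-grain storage are assumptions made purely for readability.

```python
class ChangeBitmap:
    """Tracks which fixed-size grains of a volume were written while replication was stopped."""

    def __init__(self, volume_bytes, grain_bytes=256 * 1024):
        self.grain_bytes = grain_bytes
        # Round up so a partial final grain is still tracked.
        self.num_grains = -(-volume_bytes // grain_bytes)
        # One flag per grain (a byte per flag keeps the sketch simple).
        self.dirty = bytearray(self.num_grains)

    def record_write(self, offset, length):
        """Mark every grain touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.grain_bytes
        last = (offset + length - 1) // self.grain_bytes
        for grain in range(first, last + 1):
            self.dirty[grain] = 1

    def grains_to_resync(self):
        """Return the grain indexes that must be copied when replication resumes."""
        return [grain for grain, flag in enumerate(self.dirty) if flag]


# Hypothetical 1 MiB volume: a write straddling the first grain boundary.
bitmap = ChangeBitmap(volume_bytes=1024 * 1024)
bitmap.record_write(offset=256 * 1024 - 1, length=2)
print(bitmap.grains_to_resync())
```

    Only the marked grains need to be copied on resume, which is why catching up is much cheaper than a full resynchronisation.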



    ------------------------------
    Best regards, 

    Christian Schroeder
    IBM Storage Virtualize Support with Passion
    ------------------------------



  • 7.  RE: FS7300 PBHA maintenance on the Ethernet core switch

    Posted 14 days ago

    Hello.  Chris Bulmer has already answered the question in full, but I thought providing a useful reference in the online documentation still might be helpful.  For HA+DR (3-site replication), there's an 'Example error scenarios' table here:

    https://www.ibm.com/docs/en/flashsystem-7x00/8.7.x_cd?topic=replication-high-availability-disaster-recovery-3-site

    Although that's written for a 3-site configuration, the information in the high availability column is still valid for 2-site HA if you look at just scenarios (rows) relevant to HA.  For your post, there are two rows that are relevant:

    • One or both of the HA systems lose connectivity to all quorum applications
    • Loss of management IP connectivity only between the HA systems

    You'll see the information there matches the answer ChrisB provided.



    ------------------------------
    Chris Canto
    Software Developer
    IBM
    Hursley
    ------------------------------