Primary Storage


Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

  • 1.  Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Mon February 06, 2023 09:13 AM
    Hello Community and good day,

    my question is: can IBM FlashSystem 5200 systems ***temporarily*** support a mix of SAN switch-attached and direct-connect host connections at the same time in a VMware environment?

    I am trying to figure out if the steps below will work in a scenario consisting of 1x SAN switch-attached IBM FlashSystem 5200 box with Fibre Channel front-end ports and 3x VMware ESXi servers. The final goal is to end up with 3x direct-connect VMware ESXi servers, so that the old SAN switches can be decommissioned with no downtime:

    ==================================================
    - place one ESXi host at a time in maintenance mode
    - unplug the HBAs from the Fibre Channel switches
    - plug the HBAs on the back of the IBM FlashSystem 5200
    - resume the ESXi host
    - repeat with the other hosts
    ==================================================
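
    For reference, here is a minimal sketch of that per-host cutover as ESXi shell commands (run on each host in turn; the recabling itself is a manual step, and VM evacuation assumes DRS or manual vMotion):

```shell
# Sketch of the per-host cutover; run on each ESXi host in turn.
# Assumes VMs are evacuated (DRS or manual vMotion) before maintenance mode.

# 1. Place the host in maintenance mode
esxcli system maintenanceMode set --enable true

# 2. (Manual step) Move the HBA cables from the FC switches
#    to the FS5200 host ports.

# 3. Rescan all HBAs so the host rediscovers its paths
esxcli storage core adapter rescan --all

# 4. Verify the paths to the FS5200 volumes are back
esxcli storage core path list

# 5. Resume the host
esxcli system maintenanceMode set --enable false
```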

    I would like to stress that this will be a ***temporary*** solution.

    Any help will be greatly appreciated.

    Thanks and Regards,

    M.

    ------------------------------
    Massimiliano Rizzi
    ------------------------------


  • 2.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?
    Best Answer

    IBM Champion
    Posted Tue February 07, 2023 03:17 AM
    Hi Massimiliano,

    The simple answer is yes. First, you need to disable NPIV (which is enabled by default) to support direct-attached hosts on the FS5200:
       a) chiogrp -fctargetportmode transitional 0
       b) chiogrp -fctargetportmode disabled 0

    Then continue your procedure step by step. You also need to check that ESXi sees the FS5200 volumes as Flash disks (instead of HDD), and the multipath algorithm must be Round Robin. We also recommend setting the round robin IOPS limit to 1.
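
    On the ESXi side, those multipath settings can be applied per device with esxcli; a sketch, using a placeholder naa device ID:

```shell
# Placeholder device ID; substitute the naa ID of each FS5200 volume
DEV=naa.600507681281012be00000000000000d

# Set the path selection policy to Round Robin
esxcli storage nmp device set --device "$DEV" --psp VMW_PSP_RR

# Set the round robin IOPS limit to 1, as recommended above
esxcli storage nmp psp roundrobin deviceconfig set --device "$DEV" --type iops --iops 1

# Verify the resulting device configuration
esxcli storage nmp psp roundrobin deviceconfig get --device "$DEV"
```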

    Regards,




    ------------------------------
    Nezih Boyacioglu
    ------------------------------



  • 3.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Tue February 07, 2023 04:29 AM
    Hi Nezih,

    first of all thank you for taking the time to answer my post. It is very much appreciated.

    With regards to setting the round robin iops limit to 1, I understand that we will need to adjust the limit on each host using the procedure described at https://kb.vmware.com/s/article/2069356.

    I wish you a great rest of the day.

    Kind Regards,

    Massimiliano

    ------------------------------
    Massimiliano Rizzi
    ------------------------------



  • 4.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    IBM Champion
    Posted Tue February 07, 2023 04:45 AM
    Yes, it's right. You can also use our IBM FlashSystem and VMware Implementation and Best Practices Guide Redbook. 

    https://www.redbooks.ibm.com/abstracts/sg248505.html

    ------------------------------
    Nezih Boyacioglu
    ------------------------------



  • 5.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    User Group Leader
    Posted Tue February 07, 2023 04:33 AM

    Hi!  You do not need to (nor should you) remove NPIV for direct attach.

    That was a very early restriction that has since been removed.

    We support DA and SAN Attach simultaneously, even on the same HBA.
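
    For anyone wanting to confirm the current setting before touching anything, the FC target port mode is shown in the I/O group properties on the system CLI (a sketch; the system hostname is a placeholder):

```shell
# Query I/O group 0 over SSH (hostname is a placeholder);
# the fctargetportmode field reads "enabled" when NPIV target ports are in use
ssh superuser@fs5200 "lsiogrp 0"
```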



    ------------------------------
    Evelyn Perez
    ------------------------------



  • 6.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    IBM Champion
    Posted Tue February 07, 2023 04:43 AM
    This is great news. Need to update my best practices :)

    Thank you Evelyn

    ------------------------------
    Nezih Boyacioglu
    ------------------------------



  • 7.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Tue February 07, 2023 08:10 AM
    Hi there Evelyn/Nezih,

    thank you again for your time and for the information provided.

    It's always great to receive information from knowledgeable experts from the vendor :)

    Kind Regards,

    Massimiliano

    ------------------------------
    Massimiliano Rizzi
    ------------------------------



  • 8.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Wed January 03, 2024 09:57 AM

    Hello Evelyn, 

    I have a real case with a customer where direct-attached and switched storage are used simultaneously, but on different hosts.

    NPIV works well when there is a switch between the host and the storage. However, the directly attached host seems to be unstable when it can no longer access the dedicated storage port, even when the NPIV failover to the second node is successful. The problem is random: every time we enter/exit the canister from the service state, a different server out of the 6 cannot recover its paths, and the HBA port cannot communicate with the storage.
    Only disconnecting and reconnecting the FC cable releases this "stuck" state.

    I saw the below statement from Hans Populaire:

    Posted Thu February 09, 2023 10:45 AM
    Edited by Hans Populaire Thu February 09, 2023 11:09 AM

    Valid remarks regarding NPIV.

    Most probably NPIV is already in use today, and if this is the case, it can stay as is. Disabling it would require SAN zoning updates, which is something you probably do not want to do.

    Once you are in a direct-attach situation, the ESX host will also use the "NPIV" WWN, but in this case I think the benefit of NPIV will not apply for a directly attached host compared to a SAN-attached connection: if a host/storage link goes down, it will stay offline for the host until the link has been restored.

    What do you think about that? Has such an environment been tested to confirm stability with both direct and switched attachment configured simultaneously?

    Regards, 







    ------------------------------
    Aleksandar Ivanov
    ------------------------------



  • 9.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Tue January 09, 2024 05:37 AM

    Hi Evelyn,

    Where should we look for these kinds of changes to early restrictions?

    Regards,



    ------------------------------
    Istvan Buda
    buda.istvan@telekom.hu
    ------------------------------



  • 10.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Tue January 09, 2024 05:47 AM

    @Massimiliano Rizzi I agree with Nezih Boyacioglu - I recently did a migration on a direct-attach cluster, left NPIV enabled on the new FS5035, and had issues with the paths to the ESX hosts on both VMware 6.7 and VMware 7.



    ------------------------------
    Frank da Silva
    ------------------------------



  • 11.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Wed February 15, 2023 11:58 AM

    Hello Community and good day,

    just a quick question here while configuring the new FS5200. First, this NVMe storage is a beast :)

    As part of tuning the ESXi hosts for optimal IBM FS5200 storage performance, according to both the IBM FlashSystem and VMware Implementation and Best Practices Guide and what Nezih said, we checked that each ESXi host sees the FS5200 volumes as Flash disks (instead of HDD) and that the multipath algorithm is Round Robin with an I/O Operation Limit value of 1.

    As a result, prior to presenting the FS5200 volumes, we manually added a custom claim rule to each ESXi host in the cluster in order to set the path selection policy and I/O operation limit using the command below:

    ==================================================
    esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V IBM -M "2145" -c tpgs_on --psp="VMW_PSP_RR" -e "IBM arrays with ALUA support" -O "iops=1"
    ==================================================

    Afterwards we presented the FS5200 volumes to each ESXi host and ran the "esxcli storage nmp device list" command on each host to confirm that the presented FS5200 devices were claimed by the custom claim rule as expected.

    As soon as we changed the presented FS5200 devices to Flash on each ESXi host (prior to creating the datastore on one ESXi host), the PSP for the presented FS5200 devices automatically switched from "VMW_PSP_RR" to "VMW_PSP_MRU".

    We ran the commands below on each ESXi host in order to revert the PSP to "VMW_PSP_RR"; however, it appears that the presented FS5200 devices are now claimed by the default claim rule for IBM 2145 devices, and not by the custom claim rule we added, which sets the round robin IOPS limit to 1:

    ==================================================
    esxcli storage nmp device set --device naa.600507681281012be00000000000000d --psp VMW_PSP_RR
    esxcli storage nmp device set --device naa.600507681281012be00000000000000e --psp VMW_PSP_RR
    esxcli storage nmp device set --device naa.600507681281012be00000000000000f --psp VMW_PSP_RR
    ==================================================

    Definitely sounds like an issue on the VMware side, but I just wanted to check to see whether someone has already observed that.

    As usual, thank you in advance for your kind support.




    ------------------------------
    Massimiliano Rizzi
    ------------------------------



  • 12.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    IBM Champion
    Posted Wed February 15, 2023 12:34 PM

    Hi Massimiliano, 

    I wrote that section in the redbook :)
    I have also observed that in some cases the multipath algorithm turns back to MRU. While I don't know exactly why this happens, I prefer to list all claim rules and remove any rules that may be in effect and conflict.
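
    A minimal sketch of that cleanup, assuming the rules discussed in this thread (the parameters passed to `rule remove` must exactly match the rule as listed, and the device ID is a placeholder):

```shell
# List all SATP claim rules and inspect the user-added IBM/2145 entries
esxcli storage nmp satp rule list

# Remove the vendor/model rule; every parameter must match the listed rule
esxcli storage nmp satp rule remove -s VMW_SATP_ALUA -V IBM -M "2145" \
    -c tpgs_on --psp VMW_PSP_RR -e "IBM arrays with ALUA support" -O "iops=1"

# A per-device enable_ssd rule is removed by device instead (placeholder ID)
esxcli storage nmp satp rule remove -s VMW_SATP_ALUA \
    --device naa.600507681281012be00000000000000d --option "enable_ssd"
```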

    Regards



    ------------------------------
    Nezih Boyacioglu
    ------------------------------



  • 13.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Fri February 17, 2023 10:39 AM

    Hi Nezih,

    thank you for your reply. It is very much appreciated.

    I'm glad to hear that other people have also observed that. Even though we will soon be upgrading all hosts to ESXi 7.0 Update 3, I have just filed a support case with VMware in order to take a deeper look into this issue.

    I will update this thread with the findings.

    Kind Regards,

    Massimiliano Rizzi



    ------------------------------
    Massimiliano Rizzi
    ------------------------------



  • 14.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Mon March 06, 2023 07:50 AM
    Edited by Massimiliano Rizzi Mon March 06, 2023 07:49 AM

    Hello  Community and good day, 

    apologies for my belated follow-up.

    Looking at the SATP rules from the log bundle I provided, VMware Technical Support sees that there are two custom rules: one for iops=1 and the other for enable_ssd:

    ==================================================
    VMW_SATP_ALUA                                              IBM       2145                                                             user        tpgs_on                              VMW_PSP_RR   iops=1       IBM arrays with ALUA support

    VMW_SATP_ALUA        naa.600507681281012be00000000000000d                                                 enable_ssd                  user
    VMW_SATP_ALUA        naa.600507681281012be00000000000000e                                                 enable_ssd                  user
    VMW_SATP_ALUA        naa.600507681281012be00000000000000f                                                 enable_ssd                  user
    VMW_SATP_ALUA        naa.600507681281012be000000000000010                                                 enable_ssd                  user
    ==================================================

    Because of that, they asked me to remove the existing custom rule and add a generic rule with both options (iops=1 and enable_ssd):

    ==================================================
    esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V IBM -M "2145" -c tpgs_on --psp="VMW_PSP_RR" -e "IBM arrays with ALUA support" -O "iops=1" --option "enable_ssd"
    ==================================================

    Although I haven't tried this yet, it does make sense to me to add a generic rule with both options (iops=1 and enable_ssd) in order to avoid conflicting SATP rules.

    What do you think about that?

    Thanks and Regards,



    ------------------------------
    Massimiliano Rizzi
    ------------------------------



  • 15.  RE: Do FlashSystem 5200 systems support a mix of fabric and directly attached hosts at the same time ?

    Posted Thu February 09, 2023 10:45 AM
    Edited by Hans Populaire Thu February 09, 2023 11:09 AM

    Valid remarks regarding NPIV.

    Most probably NPIV is already in use today, and if this is the case, it can stay as is. Disabling it would require SAN zoning updates, which is something you probably do not want to do.

    Once you are in a direct-attach situation, the ESX host will also use the "NPIV" WWN, but in this case I think the benefit of NPIV will not apply for a directly attached host compared to a SAN-attached connection: if a host/storage link goes down, it will stay offline for the host until the link has been restored.

     



    ------------------------------
    Hans Populaire
    ------------------------------