IBM i Global


IBM i performance with vSCSI external storage

  • 1.  IBM i performance with vSCSI external storage

    Posted Tue April 30, 2024 01:34 PM

    Hello everyone!

    We have started a PoC to test the connection between Pure Storage and our "IBM i" environment.

    Our environment consists of IBMi lpars connected to V7000 storage via NPIV (with tiering enabled in three tiers: nl-SAS, SAS and SSDs). We also have Linux/SLES/SAP lpars (or VMs), but they won't be part of the PoC at the moment.

    Unfortunately, IBM i cannot communicate with Pure Storage via NPIV, so with the resources we currently have, we opted for a configuration with SAN zoning between Pure Storage and the VIOS servers, and then vSCSI between the VIOS and the IBM i LPARs.

    Four 8 Gbps FC physical ports were zoned, two for each VIOS, each on a separate SAN fabric.

    We created ONE virtual SCSI server adapter in each VIOS to communicate with ONE vSCSI client adapter in the IBM i partition.

    With this configuration, we got slower response times than with our V7000 (in fact, for this PoC we kept two partitions identical in processor and memory, one on the V7000 and one on Pure Storage). We ran tests on batch and online business processes, and we also compared response times for IPLs and for disk-to-disk backup operations (SAVLIB and SAVOBJ to *SAVF). In all cases, we obtained the same or worse times with Pure Storage. This was a surprise, given that Pure Storage is all-flash while our V7000 is hybrid storage (mechanical disks plus SSD, which by itself should be inferior to all-flash I/O performance).

    One detail that caught our attention was the output of the WRKDSKSTS command, where the "% Busy" column was always above 50%, which is quite unusual in our experience with the V7000, where it very rarely exceeds 15%.

    To rule out contention and/or a capacity limit on the virtual adapters, we created 3 more virtual SCSI adapters in each VIOS. We finished the tests with 4 adapters in each VIOS and 8 in the IBM i partition, leaving only 2 LUNs on each adapter, but the times were even worse; not by much, but worse than with only 1 adapter per VIOS and 2 (one per VIOS) in the IBM i partition.

    Has anyone experienced this situation? Is this really the expected result?

    Best regards!



    ------------------------------
    ===============
    Marcos Daniel Wille
    ===============
    ------------------------------


  • 2.  RE: IBM i performance with vSCSI external storage

    Posted Wed May 01, 2024 04:18 AM
    Edited by Satid S Wed May 01, 2024 04:27 AM

    Dear Marcos

    You should provide more HW info, such as the Power server model, VIOS and IBM i release, the feature code of the fibre card used for VIOS, and the IBM i LPAR and VIOS HW resource allocations. Did you allocate dedicated CPU (as opposed to CPU from a Shared Processor Pool) to VIOS?  VIOS likes to have its own pool of CPU.

    You should use the IBM i Performance Data Investigator (PDI) charts on disk performance (Disk Throughput Overview for Disk Pools, Physical Disk I/O Overview - Basic) to compare the existing IBM i LPAR against the PoC one.  PDI also provides a chart under Physical System --> CPU Utilization for all LPARs that you enable for performance data collection (a parameter in the LPAR profile), but that one does not show you memory faulting in the VIOS LPAR.

    One possible cause of the problem you encountered is that the VIOS that serves vSCSI to the IBM i LPAR (or to any client LPAR) may have limited resources: CPU and memory. Since you did not provide this VIOS information, why not try adding more CPU and memory to the VIOS and see whether the problem goes away.  You should also use the VIOS Performance Advisor tool ( https://www.ibm.com/docs/en/power9/9223-42S?topic=advisor-virtual-io-server-performance-reports ) to check its HW resource utilization during your PoC test period. It tells you whether you have allocated adequate CPU and memory to the VIOS.  You will also need it when you add more client LPARs to the VIOS (even with NPIV).
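
    In case it helps, the advisor report is generated from the VIOS command line with the part tool; a minimal sketch, assuming a 30-minute collection window (the interval and the nmon file name are only placeholders):

        $ part -i 30              # collect for 30 minutes, then produce a .tar containing the advisor report
        $ part -f viosdata.nmon   # alternatively, post-process an existing nmon recording

    Copy the resulting .tar off the VIOS and open the report in a browser to see the CPU, memory and FC adapter findings.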

    The next important factor, which you did not mention, is how the Pure Storage LUNs are created over its physical disk units and how you map each LUN to a vSCSI unit for IBM i to use.  And how many vSCSI disk units did you create for IBM i? IBM i likes a lot of disk units (physical or not).

    >>>> We finished the tests with 04 adapters in each VIOS and 08 in the IBMi partition, leaving only 02 LUNs in each adapter, but the times were even worse <<<<

    The more virtual SCSI adapters and vSCSI disk units (and vFC for NPIV) you create in VIOS for a client LPAR, the more memory (and perhaps a bit more CPU) you need to allocate to the VIOS.  Did you do this?  Your "2 LUNs in each adapter" also sounds odd to me. Assuming you use a POWER8 CPU or later, my view is that you should use just 1 vSCSI adapter (or 2 for fault tolerance) plus at least 5 LUNs (or a bit more if your workload is disk-I/O intensive at peak periods) for each CPU you allocate to an IBM i LPAR that connects to flash storage.



    ------------------------------
    Satid S
    ------------------------------



  • 3.  RE: IBM i performance with vSCSI external storage

    Posted Wed May 01, 2024 06:16 AM

    Hi Marcos,

    IBM i can't use storage provisioned from Pure Storage arrays using NPIV. This is because Pure Storage arrays do not provide the emulation/driver that the IBM i LIC can recognize as a valid storage model/type.

    Just to add to the extensive answer from Satid: you are fully dependent on the VIOS for the best performance settings. I suggest you implement the Pure Storage ODM on each VIOS for best compatibility, in case you haven't done so yet: https://support.purestorage.com/bundle/m_ibm/page/Solutions/IBM/AIX/topics/concept/c_aix_recommended_settings.html
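
    In case you haven't, a quick sanity check on each VIOS that the ODM package is actually installed and that the Pure disks picked it up could look roughly like this (a sketch only; the exact fileset name varies, so the grep is just illustrative, and hdisk2 is an example):

        $ oem_setup_env                         # drop from the padmin restricted shell to the root AIX shell
        # lslpp -l | grep -i pure               # confirm the Pure ODM fileset is installed
        # lsdev -Cc disk | grep -i pure         # the hdisks should report the PURE MPIO description
        # lsattr -El hdisk2 -a reserve_policy   # when the same LUN is mapped through both VIOS, no_reserve is typically required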

    As Satid mentioned, you need to be sure that you have enough CPU+memory on each VIOS if you are driving a lot of IOPS.

    But in all cases, the rule of thumb is that IBM i with native NPIV will always produce a bit better performance than vSCSI. This article explains in detail why that is the case: https://www.linkedin.com/pulse/fight-rages-npiv-vs-vscsi-fredrik-lundholm/

    Regards,

    Tsvetan



    ------------------------------
    Tsvetan Marinov
    ------------------------------



  • 4.  RE: IBM i performance with vSCSI external storage

    Posted Wed May 01, 2024 08:20 AM
    Edited by Satid S Wed May 01, 2024 08:25 AM

    Dear Marcos and Tsvetan

    Tsvetan provided an incorrect URL.  The correct one is this: https://support.purestorage.com/bundle/m_ibm/page/Solutions/IBM/AIX/topics/concept/c_ibm_powervm_with_flasharray.html    On that page, look for a link to download a PDF paper named "IBM PowerVM with FlashArray" and follow the instructions in it.   You may consider whether using the ODM as described in the paper is needed for your case or not. One important parameter to set is the Queue Depth for Pure Storage.



    ------------------------------
    Satid S
    ------------------------------



  • 5.  RE: IBM i performance with vSCSI external storage

    Posted Wed May 01, 2024 07:44 AM

    I'll add my vote for VIOS performance as the most likely issue.  Starving your VIOS of resources will have a major impact on client LPAR I/O performance, especially with vSCSI for IBM i doing the sector-size (520/512) mapping.

    Using the IBM SAN Volume Controller (SVC) is another option for any third-party storage not directly supported for IBM i.  In this case, you would attach your Pure LUNs to the SVC as external storage and use the SVC to present vdisks/LUNs to IBM i via NPIV.   The V7000 is in the SVC/Spectrum Virtualize family, so you might be able to test this configuration with your existing V7000.
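
    For illustration only, the Spectrum Virtualize side of that would be roughly the usual external-virtualization flow; all names below are made up and the sizes are just examples:

        detectmdisk                                                              # discover the Pure LUNs presented to the SVC as external MDisks
        lsmdisk                                                                  # confirm the new unmanaged MDisks
        mkmdiskgrp -name pure_pool -ext 1024 -mdisk mdisk10:mdisk11:mdisk12      # group them into a pool
        mkvdisk -mdiskgrp pure_pool -iogrp 0 -size 80 -unit gb -name IBMI_LUN01  # carve a volume for the IBM i host
        mkvdiskhostmap -host IBMI_HOST IBMI_LUN01                                # map it to the IBM i host object

    The IBM i LPAR would then see the volumes through its NPIV virtual FC adapters, the same way it sees the V7000 today.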



    ------------------------------
    Vincent Greene
    IT Consultant
    Technology Expert labs
    IBM
    Vincent.Greene@ibm.com


    The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
    ------------------------------



  • 6.  RE: IBM i performance with vSCSI external storage

    Posted Thu May 02, 2024 10:50 AM
    Hello everyone,
      More details about PoC:
      
      Satid:
        This PoC is running on a Power8 server, 8286-42A (our D/R environment).
        In this server we have two VIOS at level 3.1.4.31, each with 1 dedicated processor core and 16 GB of RAM.
        Each VIOS has 3 FC cards (feature codes: 5273, 5735, 577D, EL2N, EL58), with firmware at the latest level too.
        The IBM i is on V7R4 TR8 (cumulative PTF package C3117740), with 1 capped shared processor, 1 virtual processor and 128 GB of RAM.
        During testing, I monitored VIOS CPU with NMON, and it didn't exceed 10% usage. However, disk usage is about 50% (half for write and half for read operations).
        Regarding the creation of LUNs in Pure Storage, we don't handle that part ourselves; we just requested the creation of 8 LUNs of 640 GB each.
        From what we were told, the process is similar to the V7000: the LUNs are created and presented to a "host" that points to the WWPNs of the VIOS FC ports. The SAN is then zoned to present the LUNs to the VIOS.
    To deliver them to the IBM i partition, we used the "Virtual Storage Management" menu in the HMC and directly mapped the "physical" volumes to the vSCSI device ID of the IBM i LPAR; this was done in each of the VIOSes.
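
    For reference, the equivalent done straight on the VIOS command line (instead of the HMC GUI) would be along these lines; vhost0, hdisk2 and the VTD name are just examples:

        $ lsdev -virtual                                         # list the vhost (server vSCSI) adapters
        $ mkvdev -vdev hdisk2 -vadapter vhost0 -dev ibmi_lun01   # map one Pure hdisk to the IBM i client adapter
        $ lsmap -vadapter vhost0                                 # verify the backing devices behind that adapter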
      Tsvetan:
        Following the recommendation of the Pure staff (and also IBM), we installed the ODM for Pure on both VIOS, and all the disks presented by Pure use this driver, as you can see below (output of "lsdev -Cc disk"):
          hdisk2 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk3 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk4 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk5 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk6 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk7 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk8 Available 01-00-01 PURE MPIO Drive (Fibre)
          hdisk9 Available 01-00-01 PURE MPIO Drive (Fibre)
        We've just double-checked the recommended parameters in the links you sent us, and they're all set as recommended.
        
    Best Practice for AIX:
          Install the ODM modification on the VIO Server and AIX and validate that the following parameters are set properly (a verification sketch follows below):
           • Algorithm set to shortest_queue: chdev -l hdiskX -a algorithm=shortest_queue
           • FC error recovery set to fast_fail: chdev -l fscsi0 -a fc_err_recov=fast_fail -P
           • Dynamic tracking enabled: chdev -l fscsi0 -a dyntrk=yes -P
           • queue_depth=256 (set by the Pure ODM module)
           • max_transfer set to 0x400000
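
    A quick way to double-check those values on each VIOS (hdisk2 and fscsi0 are only examples) could be:

           # lsattr -El hdisk2 -a algorithm -a queue_depth -a max_transfer
           # lsattr -El fscsi0 -a fc_err_recov -a dyntrk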
    Vincent:
        Using the V7000 to virtualize was actually our initial idea, even to "clone" the LUNs we would need, but the Pure people told us it wouldn't be possible with the V7000, so we didn't pursue that solution.
    Regards,


    ------------------------------
    Marcos D. Wille
    ------------------------------



  • 7.  RE: IBM i performance with vSCSI external storage

    Posted Thu May 02, 2024 01:33 PM

    Hi,

    Is this 8 LUNs of 640 GB each (8*640), or 8 LUNs of 80 GB each?

    Regards 



    ------------------------------
    Virgile VATIN
    ------------------------------



  • 8.  RE: IBM i performance with vSCSI external storage

    Posted Thu May 02, 2024 02:50 PM

    Hello Virgile,

      There are 8 LUNs of 640 GB each, totaling ~5 TB.



    ------------------------------
    Marcos D. Wille
    ------------------------------



  • 9.  RE: IBM i performance with vSCSI external storage

    Posted Thu May 02, 2024 06:24 PM

    Hum, 

    I don't know if it will solve your issue, but from various IBM university sessions on external storage (and for internal storage too), IBM i works better with more arms (disks). It's better to have many small disks than a few big ones; for example, 50 disks of 100 GB will run better than 8 x 640 GB. This remains true with FlashCore Modules.

    In WRKSYSACT, what do you see for jobs and I/O?

    Do you have the same configuration on your V7000?

    Regards



    ------------------------------
    Virgile VATIN
    ------------------------------



  • 10.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 07:23 AM

    Hello Virgile,

      We have created a controlled environment for the PoC and both partitions have the same disk distribution, i.e. both with 8 LUNs of ~640 GB.

      The primary objective is not exactly to get the best possible performance, but to compare two LPARs that are identical in resources (processor and memory) using two different storage solutions: one with its disks on Pure and the other in a scenario we are already used to, with disks on the V7000. We were sure that performance on the flash storage (Pure) would be much better, so we were surprised by this result.

      Yesterday, using the VIOS Performance Advisor tool as suggested by Satid, we noticed "strange" behavior in the Pure case: it seems all the I/O is going out through just one of the FC ports, despite 4 ports being configured, and on that one port utilization reaches 100%. It could be a problem in the multipathing of Pure's ODM. We'll study this a bit more and I'll share the results.

    VIOS01 - Performance Advisor (screenshot attached)

    VIOS02 - Performance Advisor (screenshot attached)


    ------------------------------
    Marcos D. Wille
    ------------------------------



  • 11.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 09:39 AM
    Edited by Satid S Fri May 03, 2024 09:47 AM

    Dear Marcos

    >>>> Yesterday, using the VIOS Performance Advisor tool, as suggested by Satid, we noticed a "strange" behavior in the case of Pure, where, it seems, all the I/O is coming out of just one of the FC ports, despite having 04 ports configured, and, in this one port, occupancy reaches 100%.  <<<<

    It appears you may now be closer to identifying the cause of the disk performance issue, which is likely that MPIO is not working.   Please check this IBM i Technote to see if it helps with the check: "How to verify IBM i Disk Multipath Status" at https://www.ibm.com/support/pages/how-verify-ibm-i-disk-multipath-status.  Comparing what you see in the PoC LPAR against your DR LPAR may help with the checking.

    You should create 4 (or at least 2) client-side vSCSI adapters on the IBM i side for multipathing.  I recall this is automatic in recent releases of VIOS when you create a server-side vSCSI adapter, and this is true in your case.  But I cannot find any info on whether you need to additionally configure multipathing in IBM i for vSCSI adapters from VIOS.  You could try to find a Redbook on VIOS and its vSCSI support.
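
    As a side note, if you want to add another client-side vSCSI adapter without going through the GUI, the HMC command line can do it with DLPAR; this is a rough sketch only, with the managed system, partition name and slot numbers all made up (the partition profile also needs the same adapter for it to survive the next activation):

        chhwres -r virtualio --rsubtype scsi -m POWER8_SYS -o a -p IBMI_POC -s 21 \
            -a "adapter_type=client,remote_lpar_name=VIOS2,remote_slot_num=21"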



    ------------------------------
    Satid S
    ------------------------------



  • 12.  RE: IBM i performance with vSCSI external storage

    Posted Thu May 02, 2024 10:32 PM
    Edited by Satid S Thu May 02, 2024 10:38 PM

    Dear Marcos

    Your WRKDSKSTS screen shows about 1,000 IOPS for each disk unit, which indicates a disk-I/O-intensive workload. But with flash disk, 1,000 IOPS should not push % Busy to the 40% level.  I suspect the queue depth of the disk server is not set properly.

    How many disk units are there in your DR IBM i LPAR?  If there are more than the 8 units you use in your PoC IBM i LPAR, that could be one contributing cause of the issue.

    What about memory allocation in your DR and PoC LPARs?  I hope they are similarly allocated.

    What about enabling and setting the Queue Depth parameter in the Pure Storage ODM definition, as indicated in the Pure Storage paper's section "Best Practice Recommendation for MPIO"?  In my past experience, specifying a proper Queue Depth value (the paper indicates 256) for SAN disks helped solve disk I/O performance issues in several cases I was involved in. The high disk % Busy may be caused by Queue Depth not being set at all.   (Check whether you need to restart the Pure Storage after the value change.)     If you do NOT use the ODM at all, I would say you need to, in order to exploit the Queue Depth parameter.
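
    If the value does turn out to need changing, the change itself on the VIOS is one line per hdisk; a minimal sketch, with hdisk2 as an example and the value left to whichever recommendation you settle on:

        # lsattr -El hdisk2 -a queue_depth        # check what is actually in effect
        # chdev -l hdisk2 -a queue_depth=256 -P   # -P defers the change until the disk is reconfigured or the VIOS is rebooted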



    ------------------------------
    Satid S
    ------------------------------



  • 13.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 08:28 AM

    Hi Satid,

      Both LPARs are identical in processor and memory distribution, i.e. both with 1 shared capped processor, 1 VP and 8 LUNs x 640 GB = ~5.0 TB.
      QPFRADJ is locked (= 0) and the memory pools are identical on both partitions.

    WRKSYSSTS on V7000 lpar (named SANHML2)

    WRKSYSSTS on Pure lpar (named SANPURE)

      We're using Pure's ODM, with all the parameters set to the values they recommend, including queue depth.

    As I wrote in my reply to Virgile, I found strange behavior in the VIOS Performance Advisor: it seems that ALL the disks are using only one of the FC ports to communicate with Pure. I've attached the screenshots of the VIOS Performance Advisor; let's investigate this further.
    Best regards,


    ------------------------------
    Marcos D. Wille
    ------------------------------



  • 14.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 10:26 AM

    Hi Marcos,

    If convenient for you, maybe you can re-try the PoC using 16 x 320 GB LUNs instead of 8 x 640 GB. This will surely be better for IBM i because of the fixed queue depth that the IBM i LIC driver uses.

    Also check how many active paths you have through the fabrics using "lsmpio" and "lsmpio -ar". This will give you an idea of whether the MPIO algorithm is actually working by distributing the active connections through each FC adapter.
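
    To make that concrete, something along these lines on each VIOS should show whether the paths are Enabled and whether the I/O counts are actually spread across both fscsi ports (hdisk2 is just an example):

        $ oem_setup_env        # lsmpio runs from the root AIX shell
        # lsmpio -l hdisk2     # path state and the parent fscsi adapter of each path
        # lsmpio -ar           # per-adapter / remote-port view of where the I/O is really going
        # lsmpio -Sl hdisk2    # cumulative path statistics for that one disk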

    Regards,
    Tsvetan



    ------------------------------
    Tsvetan Marinov
    ------------------------------



  • 15.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 10:35 AM
    OK, sorry for this question, but what about your FC connections and zoning from the VIOS to the Pure storage?





  • 16.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 10:59 AM

    From what I've already configured with vSCSI LPARs and IBM i, I used to set the queue_depth to 32. Maybe you can give that a try.

    Regards 





  • 17.  RE: IBM i performance with vSCSI external storage

    Posted Fri May 03, 2024 11:47 AM

    Marcos, following Virgile's line of thought, from the IBM FlashSystem Best Practices and Performance Guidelines Redbook, Appendix A: https://www.redbooks.ibm.com/redbooks/pdfs/sg248503.pdf


    Not sure if this applies to Pure Storage also...

    Defining LUNs for IBM i

    LUNs for an IBM i host are defined from IBM Spectrum Virtualize block-based storage. They are created from available extents within a storage pool, the same way as for open system hosts.

    Even though IBM i supports a usable LUN size of up to 2 TB - 1 byte for IBM Spectrum Virtualize storage, using only a few large LUNs for IBM i is not recommended for performance reasons.

    In general, the more LUNs that are available to IBM i, the better the performance, for the following reasons:
     • If more LUNs are attached to IBM i, storage management uses more threads and therefore enables better performance.
     • More LUNs provide higher I/O concurrency, which reduces the likelihood of I/O queuing and therefore the wait-time component of the disk response time, resulting in lower latency of disk I/O operations.

    For planning, consider that a higher number of LUNs may also require more physical and/or virtual FC adapters on IBM i, based on the maximum number of LUNs supported by IBM i per FC adapter port.

    The sizing process helps to determine a reasonable number of LUNs required to access the needed capacity while meeting performance objectives. Regarding both these aspects and the preferred practices, our guidelines are as follows:
     • For any IBM i disk pool (ASP), define all the LUNs as the same size.
     • 40 GB is the preferred minimum LUN size.
     • You should not define LUNs larger than about 200 GB.
     • A minimum of 8 LUNs for each ASP is preferred for small IBM i partitions, typically a couple of dozen LUNs for medium systems, and up to a few hundred for large systems.

    When defining LUNs for IBM i, consider the following required minimum capacities for the load source (boot disk) LUN:
     • With IBM i release 7.1, the minimum capacity is 20 GB.
     • With IBM i release 7.2 before TR1, the minimum capacity is 80 GB.
     • With IBM i release 7.2 TR1 and later, the minimum capacity is 40 GB.



    ------------------------------
    Robert Weyer
    ------------------------------



  • 18.  RE: IBM i performance with vSCSI external storage

    Posted Mon May 06, 2024 06:33 AM

    Dear Marcos,

    I'm not surprised by these results!

    We tend to forget the basic operation of IBM i, i.e. it works with 520-byte sectors; reading a block of 8 520-byte sectors results in reading 9 512-byte sectors. It's possible that the Pure Storage array doesn't handle this type of I/O well, as we saw a few years ago with disk cards where IBM's recommendation was "to limit I/O, disable the card cache ...".
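
    To spell out the arithmetic behind that mapping: 8 x 520 bytes = 4,160 bytes, while 8 x 512 bytes = 4,096 bytes, so each 8-sector IBM i block spans ceil(4,160 / 512) = 9 back-end 512-byte sectors.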

    The tests carried out with 4 vSCSI adapters and 2 LUNs per adapter are useless, and so is adding memory to the VIOS as suggested: with vSCSI connections there is no data transfer between the IBM i partition and the VIOS; only the requests are transferred, because they are modified by the VIOS, and the data buffers remain in the IBM i partition. This is unlike NPIV connections, where the VIOS needs to be able to receive data for several virtual FC adapters, so you need appropriate buffers on the VIOS side.

    On the other hand, there is a very important parameter for the disks, queue_depth, which must be set to 32 (this value corresponds to what is defined on the IBM i side, which cannot be changed); it seems to me that the current default value is 20; in the past, it was 8. You can have 512 simultaneous I/Os on a SCSI adapter, which means you can manage 32 disks on a single adapter in an IBM i environment.

    Coming back to "One detail that caught our attention was the output of the WRKDSKSTS command, where the "% Busy" column was always greater than 50%, which is quite unusual in our experience with the v7000, which very rarely exceeds 15%."

    In fact, if the partition launches 32 I/Os on each disk while the queue on the VIOS side is limited to 20, the disk occupancy rate on the IBM i side can only increase.



    ------------------------------
    Nicolas FRAYSSE
    ------------------------------



  • 19.  RE: IBM i performance with vSCSI external storage

    Posted Mon May 06, 2024 04:23 PM

    Hello, everyone!

    Updating some more results obtained after the valuable tips and links from each of you.

    We changed the queue_depth from 256 (recommended by Pure) to 32 (recommended by several of your tips and also by IBM publications). Once changed, we IPLed the IBM i (it was actually already powered off) and restarted the two VIOSes as well (one at a time :)   ).

    We also changed the zoning in the SAN, which had initially been done for only one port on each VIOS; we changed it to 2 FC ports on each VIOS, and now the traffic is passing through 4 FC ports in a balanced way, as shown by the VIOS Performance Advisor below.

    However, even with these adjustments, the result did not reach the performance we expected from all-flash storage over vSCSI. Which isn't bad at all, since we have no I/O contention problems with our current architecture (hybrid IBM storage with NPIV).

    I also took note of the tips about increasing the number of LUNs (disk arms) and decreasing their size, but as our main objective was to compare the two storages with the same volume and LUN count, we were satisfied.

    We also realized that by moving our FC cards from 8 Gbps to 16 Gbps or even 32 Gbps we could achieve better results with a vSCSI solution, but back in 2014 we stopped using SSP (shared storage pools) with vSCSI and adopted NPIV as our main "protocol", because we believe NPIV is a cleaner and more flexible solution with less pressure on the VIOS workload.

    I really appreciate everyone's contribution!!!


    Have a great week!!!

    Best regards!



    ------------------------------
    Marcos D. Wille
    ------------------------------