FS7300 FC Connectivity Issues After Upgrades

  • 1.  FS7300 FC Connectivity Issues After Upgrades

    Posted Mon December 04, 2023 12:19 PM

    Hi,

    We upgraded one of our two FS7300s over the weekend from 8.5.0.5 to 8.5.0.9; the upgrade finished at approx 10:30 on Saturday morning. Both FS7300s sit behind a 6 node SVC cluster. We came in on Monday to find three "Number of device logins reduced" error messages in the event log of the FS7300, raised in the early hours of Sunday, some 14.5 hours post upgrade. Strange, we thought, we don't normally get those on the backend storage, but anyway, let's go through the fix procedure to re-scan and hopefully clear the errors. Unfortunately none of the three cleared; they all fail with:

    The disk discovery succeed but failed to find storage system n. The configuration might still be valid

    So on further investigation, using the lsfabric CLI command and the equivalent "Fibre Channel Connectivity" GUI screens, we can see that 3 of the SVC nodes each have 4 of their 24 FC connections reported as Type "Storage System" and not as "Host". For an FS7x00 sitting behind an SVC we would expect all 24 to be "Host". The other 3 SVC nodes are fine, with all 24 FC connections showing a Type of Host.
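    For anyone wanting to repeat this check without clicking through the GUI, here is a minimal Python sketch that tallies login types per remote object from delimited lsfabric output. It assumes the text was captured with a colon delimiter and a header row, that the columns include `name` and `type`, and that the type strings are `host` and `controller`; the sample rows and WWPNs are invented for illustration, so adjust the names to whatever your code level actually prints.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample shaped like colon-delimited lsfabric output.
# Column names, type strings and WWPNs are assumptions for illustration.
HEADER = ("remote_wwpn:remote_nportid:id:node_name:local_wwpn:"
          "local_port:local_nportid:state:name:cluster_name:type")
ROWS = [
    "5005076810000001:010100:1:node1:5005076812000001:1:020100:active:svc_node1::host",
    "5005076810000002:010200:1:node1:5005076812000001:1:020100:active:svc_node1::host",
    "5005076810000003:010300:2:node2:5005076812000002:1:020200:active:svc_node1::host",
    "5005076810000004:010400:2:node2:5005076812000002:1:020200:active:svc_node1::controller",
    "5005076810000011:011100:1:node1:5005076812000001:1:020100:active:svc_node2::host",
    "5005076810000012:011200:2:node2:5005076812000002:1:020200:active:svc_node2::host",
]
SAMPLE = "\n".join([HEADER] + ROWS) + "\n"


def tally_types(lsfabric_text, delim=":"):
    """Count logins per (remote object name, login type)."""
    reader = csv.DictReader(io.StringIO(lsfabric_text), delimiter=delim)
    counts = defaultdict(Counter)
    for row in reader:
        counts[row["name"]][row["type"]] += 1
    return counts


def unexpected_types(counts, expected="host"):
    """Return only the remote objects that have a login of a type other than `expected`."""
    return {
        name: dict(by_type)
        for name, by_type in counts.items()
        if any(login_type != expected for login_type in by_type)
    }


if __name__ == "__main__":
    for name, by_type in unexpected_types(tally_types(SAMPLE)).items():
        print(name, by_type)
```

    Run against a saved copy of the real output, this should list only the SVC nodes that have logins of an unexpected type, which makes it easier to compare the two FS7300s side by side.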

    This got us thinking: we upgraded our other FS7300 to 8.5.0.9 before this one, with no "...device login.." errors, so how does its FC Connectivity look? We found it also had one SVC node (only one this time) with 4 out of 24 FC connections showing as a Type of Storage System, and not Host.

    From the SVC end, we have no errors, we have full paths in place to all the backend storage systems as expected, whether that be the FS7300's with issues, or the 7200's, or v5k. The SVC is showing all FC Connectivity to Storage Systems as Type "Storage System".

    So we have an SVC showing storage as storage, one backend FS7300 showing 12 x SVC node FC connections as Storage System, the other FS7300 showing 4 x SVC node FC connections as Storage System, and one of the FS7300s having "....device login...." errors.

    So where do we go from here, how do we get back to a full backend Host FC connection for all SVC nodes, and how do we clear the "....device login...." errors on the FS7300, anyone have any ideas? 

    Shout if you need any further info and we'll get back to you, or if you recommend a support case raised we'll do that.

    Cheers, Andy



    ------------------------------
    Andy 91717
    ------------------------------


  • 2.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Tue December 05, 2023 03:10 AM

    The first thing I would check is the SAN zoning. For SVC-to-storage zoning:

    • Only use the "physical" WWN's of the SVC nodes (do not use the NPIV ones)
    • Only use the NPIV WWN of the FS7300 storage (I think it is the 2nd out of 3 WWN's per fiber port)


    ------------------------------
    Hans Populaire
    ------------------------------



  • 3.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Tue December 05, 2023 09:46 AM

    Hi Hans,

    Thanks for your reply. As mentioned in my first post, we have FS7200s & FS7300s attached to our 6 node SVC cluster. When we look at the configuration of the SVC & FS7x00, they do differ as we implemented them over a number of years:

    • SVC - NPIV is Disabled, has been since day one many years ago, from old DH8 -> SV1 -> SV2, so we only have physical ports available, we never enabled NPIV as it became prevalent.
    • FS7200 - NPIV is Enabled - and the NPIV ports are zoned to the SVC physical ports
    • FS7300 - NPIV is Disabled - so physical ports are zoned to the SVC physical ports

    Whilst that is a mishmash of connections and, from your and Patrik's comments, not best practice, that is the way it has been over many years of SVC nodes, and in the last three years we got the 7200s and 7300s installed.

    We are fairly sure, but have yet to prove, that when we put in the FS7300 we found a best practice that we interpreted as saying back-end storage should have NPIV disabled and use physical ports when presented only to an SVC that itself has only physical ports. Our in-house install guide, with screenshots, deliberately has a section on disabling NPIV at an I/O group level.

    Due to the secure nature of our account we don't upload logs to any suppliers, which does make our jobs very very difficult. We do have IBM account managers we can reach out to for specialised support as required. So I am going to ask them for advice and guidance on what may be causing the random Storage System Types for some of the SVC node ports on the backend FS7300s, and why one data hall has only 4 Storage Systems and the other has 12, when the SVC has the same number of active connections to both.

    We'll update this ticket when we get some response over the coming days. 

    Cheers, Andy



    ------------------------------
    Andy 91717
    ------------------------------



  • 4.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Tue December 05, 2023 09:53 AM
    Edited by Hans Populaire Tue December 05, 2023 11:28 AM





  • 5.  RE: FS7300 FC Connectivity Issues After Upgrades

    IBM Champion
    Posted Tue December 05, 2023 03:40 AM
    First open a ticket, which is priority 1 for such an error. Then upload the snap, level 4. If you are using Storage Insights, which I recommend, this is very easy and since you apparently have maintenance for the systems, this is also included free of charge. As Hans wrote, check the zoning afterwards. 
    There is no information on this:
     
    What kind of SVC cluster is this? 6 nodes
    Is it an Enhanced with Site Awareness? 
    How many FC ports per node?
    Which nodes on which side see the storage completely, and which only partially or not at all?
     
    DC1 ------ DC2
    Node 1 - Node 2
    Node 3 - Node 4
    Node 5 - Node 6
    FS73_1 - FS73_2
     
    There should be three types of zones:
    1. SVC intracluster zones
    2. storage - SVC zones
    3. host zones
     
    You can check these first and if they are correct, you can continue.
     
    And importantly, I hope there is a working backup. At the moment the environment is running without redundancy.
     


    ------------------------------
    Patrik Groß
    ------------------------------



  • 6.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Tue December 05, 2023 10:08 AM

    Hi Patrik,

    Thanks for your reply. As mentioned above to Hans, raising tickets and sending logs are not an option for us, which does make our jobs very very difficult! 

    Just to answer some of your questions

    • Correct we are a 6 node SVC cluster
    • We have two data halls, one node of each IOgroup in each data hall
    • This is a basic stretched cluster, it is not enhanced or site aware
    • Each SVC node has 8 ports connected, 2 used for private SAN, 6 used for shared host & storage connections
    • As per your picture of DC1/DC2, we do have one FS7300 in each data hall
    • Every SVC node has full connectivity to the FS7300's in each data hall, from the SVC GUI, it has exactly the same number of FC Connectivity items
    • From the FS7300 in Data Hall 1, it has a total of 140 Host types, and 4 Storage System type in FC Connectivity screen. 
    • From the FS7300 in Data Hall 2, it has a total of 132 Host Types, and 12 Storage System type in FC Connectivity screen.
    • The Storage System type in FC Connectivity are random, not on the same SVC ports, between the FS7300's
    • Oh and we do have the three types of zones, a "failsafe" zone for the storage, SVC to Storage, and host to SVC. 
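    As a quick sanity check on the numbers in the list above, the short Python sketch below confirms that both FS7300s see the same total number of logins and that the total matches 6 SVC nodes x 24 connections each (the per-node figure from the first post). The dictionary keys are just labels chosen for this example.

```python
# Figures quoted in this thread; the key names are illustrative labels only.
SVC_NODES = 6
LOGINS_PER_NODE = 24  # "4 out of 24 FC connections" per SVC node

observed = {
    "FS7300 Data Hall 1": {"host": 140, "storage_system": 4},
    "FS7300 Data Hall 2": {"host": 132, "storage_system": 12},
}

expected_total = SVC_NODES * LOGINS_PER_NODE


def summarise(observed, expected_total):
    """Return per-system totals, whether they match, and how many logins are misclassified."""
    summary = {}
    for system, by_type in observed.items():
        total = sum(by_type.values())
        summary[system] = {
            "total": total,
            "matches_expected": total == expected_total,
            "misclassified": by_type["storage_system"],
        }
    return summary


if __name__ == "__main__":
    print("expected logins per FS7300:", expected_total)
    for system, info in summarise(observed, expected_total).items():
        print(system, info)
```

    Both totals come to 144, so no logins are missing; the logins are all present and some are simply classified with the wrong Type.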

    As above, we'll get to our account management team and ask their advice also, and see what they say. 

    I do wish I could raise a case and send logs, I really do, but my hands are tied. 

    Cheers, Andy



    ------------------------------
    Andy 91717
    ------------------------------



  • 7.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Wed December 06, 2023 04:40 AM

    Hi Andy,

    speaking for IBM RTS (Remote Technical Support) team I can say, without having a support package (alias snap) to analyze, it'll be hard if not even impossible to deliver a solution for this problem.

    As there are special IBM processes in place for sensitive accounts, do go through your account managers so you can get the help you need.

    One technical tip though: occasionally in the past we've seen an SVC showing up as a storage device instead of as a host on a back-end storage system. One of the workarounds applied was to toggle the SAN switch FC ports facing the SVC or the FlashSystem.

    Best regards, 

    Christian Schroeder



    ------------------------------
    Best regards, 

    Christian Schroeder
    IBM Storage Virtualize Support with Passion
    ------------------------------



  • 8.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Wed December 06, 2023 08:21 AM

    Hi Christian,

    Many thanks for the reply. Agree 100% on the issues around not sending logs; it does make it very difficult for IBM, and the other vendors we interact with, to help. Whilst we have the ability to download and unpack logs locally, obviously if they are in an encrypted format we're at an impasse, but one we can hopefully overcome with our account team.

    As for a disable/enable of the SAN ports, we did try this with just one SAN port as a test. We disabled an FS7300 SAN port, the Storage System entry disappeared from the FC Connectivity screen. We enabled the port again, and it came back as.....a Storage System, not a Host :-(

    Is it worth a disable/enable of all the ports, one at a time of course, to try and help clear the issue?

    Thanks,

    Andy



    ------------------------------
    Andy 91717
    ------------------------------



  • 9.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Wed December 06, 2023 08:54 AM

    Hi Andy,

    thanks for your frank reply.

    Since it's a bit unclear whether the SVC or the FS7300 is misbehaving, I'd suggest toggling an SVC-facing SAN port.

    In fact, in the past we had to advise customers to toggle all SVC SAN ports, one at a time to force a re-login, which eventually did clear this weird condition.

    If you happen to be in touch with another IBM support rep through your IBM account team, feel free to DM me so we can get in touch with each other if need be.

    All the best,

    Christian



    ------------------------------
    Best regards, 

    Christian Schroeder
    IBM Storage Virtualize Support with Passion
    ------------------------------



  • 10.  RE: FS7300 FC Connectivity Issues After Upgrades

    Posted Wed January 10, 2024 12:22 PM

    Hi Christian,

    To give a successful conclusion to our issues, I can confirm that by recycling the 4 SVC ports which were reporting issues to the backend FS7300s, this did indeed fix the FS7300 FC Type, setting it back to "Host".

    Given the Xmas and New Year change freeze we were only able to complete the task this evening. We disabled the SVC port, the Type of "Storage System" disappeared from the FS7300 FC hosts list, and when we re-enabled the SVC port, the FC Type of Host returned. We did one port at a time, clearing the SVC port login errors in between each port recycle, and everything came back working as designed.

    As you said, it just needed a poke to log in again, and the FS7300s are now reporting all ports correctly.
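    For anyone scripting the same workaround, the order of operations we followed can be sketched in Python as below. This is only an outline of the sequence: `disable`, `enable`, `login_type` and `clear_errors` are hypothetical callables you would have to supply for your own switch and storage CLIs, and `FakeFabric` is a stand-in used purely to show the flow. A real run would also need to wait for the login to settle before checking the Type.

```python
def recycle_ports(ports, disable, enable, login_type, clear_errors, expected="Host"):
    """Recycle ports one at a time, stopping at the first port that does not
    come back with the expected login Type.

    All four callables are hypothetical hooks supplied by the caller.
    """
    results = []
    for port in ports:
        disable(port)                 # the login should disappear from the FS7300
        enable(port)                  # forces a fresh fabric login
        observed = login_type(port)
        results.append((port, observed))
        if observed != expected:
            break                     # do not touch further ports until this is understood
        clear_errors(port)            # clear the login errors before the next port
    return results


class FakeFabric:
    """Toy model of the behaviour described in this thread, for illustration only."""

    def __init__(self, stuck_ports):
        self.types = {port: "Storage System" for port in stuck_ports}
        self.log = []

    def disable(self, port):
        self.log.append(("disable", port))
        self.types.pop(port, None)

    def enable(self, port):
        self.log.append(("enable", port))
        self.types[port] = "Host"

    def login_type(self, port):
        return self.types.get(port)

    def clear_errors(self, port):
        self.log.append(("clear", port))


if __name__ == "__main__":
    fabric = FakeFabric(["port_a", "port_b"])
    print(recycle_ports(["port_a", "port_b"], fabric.disable, fabric.enable,
                        fabric.login_type, fabric.clear_errors))
```

    Stopping at the first port that does not return as Host keeps the blast radius to a single path, which is the same reason we did the ports one at a time by hand.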

    Thanks to everyone for their help and advice. Now we have got this issue solved we can plan in our SVC upgrade, hopefully that will go smoother than our FS7300 ones did ;-)

    Cheers,

    Andy



    ------------------------------
    Andy 91717
    ------------------------------