Global Storage


V7000 - Host Degraded, Port Active

  • 1.  V7000 - Host Degraded, Port Active

    Posted Thu February 08, 2024 10:32 AM
    Edited by Andres Parada Fri February 09, 2024 11:19 AM

    Hi Everybody!

    We have encountered an error with our Storwize V7000.

    Our setup has 6 nodes and two Brocade SAN switches (DS_6505B).
    (Earlier we had 4 nodes connected directly to the V7000 via 8 FC cables; the 2 additional nodes and the SAN switches have only just been installed.)

    5 nodes are working properly, but NODE-2 is shown as Degraded on the Hosts page, while on the Properties >> Port Definitions page every port is Active.
    Here is the CLI output showing the problem:

    IBM_Storwize:STORAGE-5:superuser>lshost
    id name      port_count iogrp_count status   site_id site_name host_cluster_id host_cluster_name protocol owner_id owner_name
    0  NODE-1     2          4           online                     0               CLUSTER1      scsi
    1  NODE-2     2          4           degraded                   0               CLUSTER1      scsi
    2  SERVER-9   2          4           online                     1               CLUSTER2      scsi
    3  SERVER-10  2          4           online                     1               CLUSTER2      scsi
    4  NODE-3     2          4           online                     2               CLUSTER3      scsi
    5  NODE-4     2          4           online                     2               CLUSTER3      scsi
    
    IBM_Storwize:STORAGE-5:superuser>lshost 1
    id 1
    name NODE-2
    port_count 2
    type generic
    mask 1111111111111111111111111111111111111111111111111111111111111111
    iogrp_count 4
    status degraded
    site_id
    site_name
    host_cluster_id 0
    host_cluster_name CLUSTER1
    protocol scsi
    status_policy redundant
    status_site all
    WWPN 1000001*********
    node_logged_in_count 2
    state active
    WWPN 1000001*********
    node_logged_in_count 2
    state active
    owner_id
    owner_name
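
    The mismatch in the output above (host status degraded while both WWPNs report state active) can be checked for mechanically. The following is a minimal Python sketch, not an IBM tool: the function names parse_lshost_detail and degraded_with_all_ports_active are made up for illustration, and it assumes only the key/value layout of the lshost 1 output shown above.

```python
# Sketch only: parses the text of "lshost <id>" as shown in this post.
# Function names are hypothetical and not part of any IBM tooling.

SAMPLE = """\
id 1
name NODE-2
port_count 2
status degraded
status_policy redundant
WWPN 1000001*********
node_logged_in_count 2
state active
WWPN 1000001*********
node_logged_in_count 2
state active
owner_id
owner_name
"""


def parse_lshost_detail(text):
    """Return host-level fields plus a list of per-WWPN port records."""
    host = {"ports": []}
    port = None
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "WWPN":
            # Each WWPN line starts a new port record.
            port = {"WWPN": value}
            host["ports"].append(port)
        elif port is not None and key in ("node_logged_in_count", "state"):
            port[key] = value
        else:
            # Any other key belongs to the host itself.
            port = None
            host[key] = value
    return host


def degraded_with_all_ports_active(host):
    """True when the host is degraded although every port reports active."""
    ports = host["ports"]
    return (
        host.get("status") == "degraded"
        and bool(ports)
        and all(p.get("state") == "active" for p in ports)
    )


host = parse_lshost_detail(SAMPLE)
print(host["name"], degraded_with_all_ports_active(host))
```

    A True result only says that the port states do not explain the degraded status, so the next place to look is zoning and alias membership rather than the HBA logins.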
    


    Both SAN switches are configured with the same commands.
    SAN-1 has ST5-CAN1/1, ST5-CAN1/3, ST5-CAN2/1, ST5-CAN2/3 connected.
    SAN-2 has ST5-CAN1/2, ST5-CAN1/4, ST5-CAN2/2, ST5-CAN2/4 connected.

    The alias TVT_STORAGE-5_PORT3, for example, includes the CAN1/3 and CAN2/3 WWPNs (both virtualized and physical), and the zones for this node are the following:

     zone:  Z-TVT-NODE-2_HBA1_PORT0-TVT_STORAGE-5_PORT3
                    TVT_NODE-2_HBA1_PORT0; TVT_STORAGE-5_PORT3
     zone:  Z-TVT-NODE-2_HBA2_PORT0-TVT_STORAGE-5_PORT4
                    TVT_NODE-2_HBA2_PORT0; TVT_STORAGE-5_PORT4

    fcping seems to be working properly (TVT_NODE-2_HBA1_PORT0 is connected to SAN-1 and TVT_NODE-2_HBA2_PORT0 is connected to SAN-2):

    SAN-1:admin> fcping 10:00:00:1*:**:**:**:** (TVT_NODE-2_HBA1_PORT0)
    Destination:    10:00:00:1*:**:**:**:**
    
    Pinging 10:00:00:1*:**:**:**:** [0x010300] with 12 bytes of data:
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:722 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:695 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:628 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:720 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:660 usec
    5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
    Round-trip min/avg/max = 628/685/722 usec
    
    SAN-1:admin> fcping 10:00:00:1*:**:**:**:** (TVT_NODE-2_HBA2_PORT0)
    fcping: Error destination wwn invalid
    SAN-2:admin> fcping 10:00:00:1*:**:**:**:** (TVT_NODE-2_HBA1_PORT0)
    fcping: Error destination wwn invalid
    
    SAN-2:admin> fcping 10:00:00:1*:**:**:**:** (TVT_NODE-2_HBA2_PORT0)
    Destination:    10:00:00:1*:**:**:**:**
    
    Pinging 10:00:00:1*:**:**:**:** [0x010300] with 12 bytes of data:
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:660 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:697 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:625 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:717 usec
    received reply from 10:00:00:1*:**:**:**:**: 12 bytes time:627 usec
    5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
    Round-trip min/avg/max = 625/665/717 usec

    NODE-2 has been restarted and updated. What should we check next? Could it be a storage, SAN, or node misconfiguration?

    Thanks for the advice!



    ------------------------------
    Krisztián Révész
    ------------------------------



  • 2.  RE: V7000 - Host Degraded, Port Active

    Posted Mon February 12, 2024 10:09 PM

    Hello Krisztián.
    First, what is the model of your V7000, and what code level is it running?

    With the information redacted, there really is not enough to go on to help you,
    although the lshost output looks as though everything is fine:

    WWPN 1000001*********
    node_logged_in_count 2
    state active
    WWPN 1000001*********
    node_logged_in_count 2
    state active

    Here each HBA port of host NODE-2 is logged into each node of the V7000.
    However, the actual host WWPNs (as reported by the host itself) are needed.

    The zoning you provided uses alias names, so there might be something incorrect in the alias definitions.

    In older code levels there was a bug where a host could incorrectly display as degraded,
    but without all the pieces I cannot complete your puzzle.



    ------------------------------
    GLEN ROUTLEY
    ------------------------------



  • 3.  RE: V7000 - Host Degraded, Port Active

    Posted Tue February 13, 2024 01:43 PM

    Hi Glen,

    Thanks for your reply!
    We found the solution while we were waiting for approval for this post. 😀

    The problem came from the alias of the V7000. We had mistakenly added the physical WWPN to it as well. After we removed it, leaving only the virtual WWPNs, every node went back to Online and Active.
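
    For anyone hitting the same thing, the alias check can be sketched in a few lines of Python. This is only an illustration under assumptions: NPIV is enabled on the system, so host I/O is permitted only on the virtualized target ports, and the target port list (for example from lstargetportfc) has already been collected into dictionaries with WWPN, virtualized, and host_io_permitted fields. The WWPNs below are invented, and the function names are hypothetical.

```python
# Sketch only: flags zoning alias members that do not permit host I/O.
# All WWPNs here are made up; function names are hypothetical.

def normalize(wwpn):
    """Make switch-style (aa:bb:...) and storage-style (AABB...) WWPNs comparable."""
    return wwpn.replace(":", "").upper()


def members_without_host_io(alias_members, target_ports):
    """Return alias members whose target port does not permit host I/O.

    Members that are not found in target_ports at all are flagged too.
    """
    permitted = {
        normalize(p["WWPN"]): p["host_io_permitted"] == "yes"
        for p in target_ports
    }
    return [m for m in alias_members if not permitted.get(normalize(m), False)]


# Assumed shape of the storage-side target port list (invented values).
target_ports = [
    {"WWPN": "500507680B000001", "virtualized": "no", "host_io_permitted": "no"},
    {"WWPN": "500507680B800001", "virtualized": "yes", "host_io_permitted": "yes"},
]

# Members of a storage alias as defined on the switch (invented values).
alias_members = ["50:05:07:68:0b:00:00:01", "50:05:07:68:0b:80:00:01"]

flagged = members_without_host_io(alias_members, target_ports)
print(flagged)
```

    Any WWPN that ends up in flagged is a candidate for removal from the host-facing alias, which matches what fixed it for us.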



    ------------------------------
    Krisztián Révész
    ------------------------------