Primary Storage

  • 1.  DS3524 not responsive

    Posted Mon November 27, 2023 02:11 PM

    Hello. I have one DS3524 with a single controller, connected to a server via an LSI SAS2 adapter.

    A few weeks ago the link between the DS3524 and the server dropped briefly. After some time the link was restored.

    Yesterday the link went down completely. I do not see the disk in Windows. In DS Storage Manager the DS3524 shows as out-of-band and, after a few minutes, changes to the Unresponsive state. Ping to the controller is OK, and I can connect to the controller via telnet. The smcli -d -v command shows the controller's IP addresses and the state Unresponsive.

    I tried switching the DS3524 off and on again, but there is still no link.

    Is it possible to revive the DS3524?

    Many thanks!



    ------------------------------
    Andrew M
    ------------------------------


  • 2.  RE: DS3524 not responsive

    Posted Tue November 28, 2023 11:02 AM
    Edited by Andres Parada Wed November 29, 2023 05:03 PM

    Hello Andrew, 

    Given that the controller is currently unresponsive, we first need to know the LED status on this controller, which can be seen on the rear side. In addition, please try to connect to the controller via telnet and run the following commands:
    vdmShowDriveList
    evfShowOwnership
    rdacMgrShow
    cmgrShow
    evfShowAllVols
    excLogShow

    As soon as I get the results, I will check them and try to assist you.


    Best regards, Mousa



    ------------------------------
    Mousa Hammad
    ------------------------------



  • 3.  RE: DS3524 not responsive

    Posted Fri December 01, 2023 01:43 AM

    Thanks for the answer!

    Almost all of the listed commands are unknown on the controller.

    Only excLogShow works.

    The log is:

    ---- Log Entry #11 APR-27-2018 12:31:49 PM ----
    04/27/18-17:57:49 (IOSymbol2): PANIC: Invalid response sense data:0x110e4010 or
    replyMessage:0x0

    Stack Trace for
    Executing moduleShow(0,0,0,0,0,0,0,0,0,0) on controller A:

    MODULE NAME     MODULE ID  GROUP #    TEXT START DATA START  BSS START
    --------------- ---------- ---------- ---------- ---------- ----------
    RAID              0xebf788          3  0x5f26a60  0x80f4b08  0x81652d0
    RAID1            0x1477658          4  0x1477f20  0x1bc4408  0x1bdef78
    Debug            0x1ea44e0          5  0x2306620  0x24b24a0  0x24b5c38
    IOSymbol2:
    0x0026092c vxTaskEntry  +0x5c : vkiTask (0x11000468)
    0x0017152c vkiTask      +0xec : 0x05f7d6e4 ()
    0x05f7d880 iop::IoScheduleManager::srcOpTask(iop::IoScheduleManager::TaskControl
     *, scsi::Op *+0x1a0: cmd::CmdManager::process(scsi::Op *) ()
    0x01702c54 cmd::CmdManager::process(scsi::Op *)+0xf4 : 0x01a70c20 ()
    0x01a70c94 Thunk for (offset -4) ql::QlManager::~QlManager()+0x9634: 0x06994904
    ()
    0x06994948 symrpc::SymbolManager::utmCmdHandler(scsi::Op *)+0x48 : symrpc::UtmSe
    rvice::handleCommand(scsi::Op *) ()
    0x069af038 symrpc::UtmService::handleCommand(scsi::Op *)+0x3f8: slbSendStatus ()
    0x05fd2a80 slbSendStatus+0x140: 0x05fda7e4 ()
    0x05fda980 normalIoStart+0x1a0: setChkCondOrResConflict(scsi::Op *) ()
    0x05fdcea4 setChkCondOrResConflict(scsi::Op *)+0x44 : htd::HtdItnCmdIoStart(scsi
    ::Op *) ()
    0x05fc630c htd::HtdItnCmdIoStart(scsi::Op *)+0x4cc: 0x06051dc4 ()
    0x06051df0 sas::LtdItn::sendCmdComplete(scsi::Op *)+0x30 : sas::sasIoInSendStatu
    s(sas::_CMD *, unsigned char *, int, unsigned char) ()
    0x06061530 sas::sasIoInSendStatus(sas::_CMD *, unsigned char *, int, unsigned ch
    ar)+0x730: _vkiCmnErr__link ()
    0x0016c5e4 _vkiCmnErr   +0x104: 0x0016c820 (0x56a038, 0x7e00dc0, 0x21e07f0)
    0x0016cbd0 vkiLogShow   +0x570: sxCallback (0x28, 0x5cd33c)
    0x0015c790 sxCallback   +0x90 : 0x01488b44 ()
    0x01488be8 ddcAssertPanicCallback+0xa8 : ddc::DdcManager::ddcInterruptTriggerHan
    dler() ()
    0x01488f9c ddc::DdcManager::ddcInterruptTriggerHandler()+0x23c: ddc::DdcLogMisc:
    :logMisc(REBOOT_REASON) ()
    0x0148784c ddc::DdcLogMisc::logTaskSynopsisInfo(int)+0x12c: 0x014b5974 ()
    0x014b5974 scap::CaptureManager::captureData(const char *, int, bool)+0x6f4: _vk
    iPrintf__link ()
    0x0016abc4 _vkiPrintf   +0x64 : _vkiVPrintf (0x1af8aa4, 0x21e03d0)

    ---- Log Entry #12 NOV-21-2023 05:02:28 AM ----
    ERROR: Port 0 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 0 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 4 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 4 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 5 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 5 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 6 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 6 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 2/6 Rx Err Count 24 exceeds threshold 16

    ---- Log Entry #13 NOV-21-2023 05:02:28 AM ----
    ERROR: Type-I Port 0 ECC correctable error threshold exceeded reg 0xf1a val 0x18

    ---- Log Entry #14 NOV-22-2023 06:46:07 AM ----
    ERROR: Port 0 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 0 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 4 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 4 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 5 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 5 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 6 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 6 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 2/6 Rx Err Count 24 exceeds threshold 16

    ---- Log Entry #15 NOV-22-2023 06:46:07 AM ----
    ERROR: Type-I Port 0 ECC correctable error threshold exceeded reg 0xf1a val 0x18

    ---- Log Entry #16 NOV-22-2023 07:34:28 AM ----
    ERROR: Port 0 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 0 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 4 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 4 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 5 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 6 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 6 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 2/6 Rx Err Count 24 exceeds threshold 16

    ---- Log Entry #17 NOV-22-2023 07:34:29 AM ----
    ERROR: Type-I Port 0 ECC correctable error threshold exceeded reg 0xf1a val 0x18

    ---- Log Entry #18 NOV-22-2023 11:35:59 AM ----
    ERROR: Port 0 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 0 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 4 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 4 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 5 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 5 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 6 Bad TLP Count 1572864 exceeds threshold 16
    ERROR: Port 6 Bad DLLP Count 1073743408 exceeds threshold 16
    ERROR: Port 2/6 Rx Err Count 24 exceeds threshold 16

    ---- Log Entry #19 NOV-22-2023 11:35:59 AM ----
    ERROR: Type-I Port 0 ECC correctable error threshold exceeded reg 0xf1a val 0x18

    value = 1 = 0x1
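
    A long excLogShow dump like the one above is easier to read once the ERROR lines are grouped by date and message type. The following is a minimal Python sketch, assuming the telnet output has been saved as plain text. The function name summarize_exc_log and both regular expressions are illustrative assumptions, not part of any IBM tool, and the sample is a short excerpt of the log posted above.

```python
import re
from collections import Counter

# Entry headers look like: ---- Log Entry #12 NOV-21-2023 05:02:28 AM ----
ENTRY_RE = re.compile(r"^\s*---- Log Entry #(\d+) ([A-Z]{3}-\d{2}-\d{4}) ")
# Decimal or hex numbers, replaced by "N" so that lines differing only in
# port number or counter value fall into the same bucket.
NUMBER_RE = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def summarize_exc_log(text):
    """Return a Counter keyed by (entry date, normalized ERROR message)."""
    counts = Counter()
    current_date = None
    for line in text.splitlines():
        header = ENTRY_RE.match(line)
        if header:
            current_date = header.group(2)
            continue
        stripped = line.strip()
        if stripped.startswith("ERROR:") and current_date is not None:
            message = NUMBER_RE.sub("N", stripped[len("ERROR:"):].strip())
            counts[(current_date, message)] += 1
    return counts


# Excerpt of the log entries quoted in this thread.
SAMPLE = """\
---- Log Entry #12 NOV-21-2023 05:02:28 AM ----
ERROR: Port 0 Bad TLP Count 1572864 exceeds threshold 16
ERROR: Port 0 Bad DLLP Count 1073743408 exceeds threshold 16
ERROR: Port 2/6 Rx Err Count 24 exceeds threshold 16

---- Log Entry #13 NOV-21-2023 05:02:28 AM ----
ERROR: Type-I Port 0 ECC correctable error threshold exceeded reg 0xf1a val 0x18
"""

for (date, message), n in sorted(summarize_exc_log(SAMPLE).items()):
    print(date, n, message)
```

    Lines that are not ERROR lines, such as the stack trace in entry #11, are ignored by this sketch, so it only helps to see which error types recur on which days.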

    All LEDs on the disks are off. They blink only when I release a disk.

    The rear side looks as shown in the picture. The power supply in the picture is not connected to the mains; the main power supply is the second PS!



    ------------------------------
    Andrew M
    ------------------------------



  • 4.  RE: DS3524 not responsive

    Posted Fri December 01, 2023 03:17 AM
    Edited by Mousa Hammad Fri December 01, 2023 05:15 AM

    Hello Andrew,
    The commands could not be run because the system did not finish the startup sequence and stopped with 0F on the LED display.
    LED status 0F means "Application Start"; this is one of the system startup checkpoints.

    In the excLogShow output I can see the message "ECC correctable error threshold exceeded" reported on 21 and 22 November.
    Please try running the following command to resolve the issue:
    clearHardwareLockdown


    Also try this command, in case the system accepts it:
    ccmInvalidateCacheStoreData

    If the LED status still shows "0F", please check whether the 'Autoload Disable' option in the Boot Operation menu is set to Enable. It should be OFF.
    You can check or change this by opening the Boot Operation menu with the command "M", then selecting options 12, 7, 0 in sequence in the boot menu to reach this option.
    We had a case a long time ago where 'Autoload Disable' had been changed for some unknown reason.
    Best regards, Mousa



    ------------------------------
    Mousa Hammad
    ------------------------------



  • 5.  RE: DS3524 not responsive