AIX


Error 554 with DELL (EMC) SAN Storage boot disk.

  • 1.  Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Fri April 14, 2023 01:46 PM

    Hi experts.. 

    We have an issue at AIX boot time with a Dell (EMC) SAN storage system, caused by a lost path.
    After the "Kernel Starting" message appears, the boot suddenly stops at LED code 554 because the boot process can no longer find the path through which it had been accessing the IPL disk.

    EMC told us to reduce the configuration to a single path from the storage to the server, so that the boot cannot use another path on which the storage system does not offer an active path to the boot disk.
    But doing that is a problem: it takes SysAdmin time and risks human error every time an LPAR needs to be booted, for example unmapping virtual Fibre Channel adapters, unconfiguring SAN switch zoning, or deleting WWPNs from the host attachment configuration on the storage system.

    Does somebody know how to avoid this problem?
    We thought that maybe a vscsi disk from VIOS to LPARs could help, but, I can't believe every customer with EMC Systems should live with that problem. 

    Regards 






    ------------------------------
    Luis Alberto Rojas Kramer
    ------------------------------


  • 2.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Fri April 14, 2023 01:53 PM
    On Fri, Apr 14, 2023 at 05:45:32PM +0000, Luis Alberto Rojas Kramer via IBM Community wrote:
    > Does somebody know how to avoid this problem?

    I do not recommend booting from SAN with anything other than IBM storage
    with a native AIX driver (i.e., MPIO in the base OS).

    I've had very poor experiences booting from SAN using PowerPath in the
    past, and EMC storage. Typically it took a mksysb restore to new LUNs
    to fix. Migratepv of rootvg wasn't bootable.

    > We thought that maybe a vscsi disk from VIOS to LPARs could help,
    > but, I can't believe every customer with EMC Systems should live
    > with that problem.

    In the case of a customer environment with a non-IBM SAN where boot
    from SAN is desired, VSCSI is the proper solution. The SCSI drivers
    are native to AIX, and the VIO server can manage the SAN storage and
    paths.

    It's not as complex as it sounds. Give your LPAR two VSCSI disks for
    boot via SAN LUNs mapped through VIO, and then give the LPAR NPIV
    adapters for data LUNs. The newer HMC GUI makes mapping VSCSI disks
    easy.

    You want two boot LUNs so that you have a second drive for
    alt_disk_copy and maintenance. Not for mirroring. Remember to set the
    queue depth on the client to match the SAN LUN.
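
    The queue depth advice above can be sketched as a small check. This is only an illustration, not part of the original reply: it assumes you have captured the output of `lsattr -El <disk> -a queue_depth` on the client and for the backing LUN on the VIO server, and the disk name and sample values are hypothetical.

```python
def queue_depth(lsattr_line):
    """Extract the value from one line of `lsattr -El <disk> -a queue_depth` output."""
    fields = lsattr_line.split()
    if len(fields) < 2 or fields[0] != "queue_depth":
        raise ValueError("not a queue_depth line: {!r}".format(lsattr_line))
    return int(fields[1])


def chdev_to_match(client_disk, client_line, backing_line):
    """Return the chdev command that aligns the client with the backing LUN, or None."""
    client = queue_depth(client_line)
    backing = queue_depth(backing_line)
    if client == backing:
        return None
    # -P defers the change to the next boot, because a rootvg disk is in use.
    return "chdev -l {} -a queue_depth={} -P".format(client_disk, backing)


if __name__ == "__main__":
    # Hypothetical values: client VSCSI disk at 3, backing LUN on VIOS at 20.
    print(chdev_to_match("hdisk0",
                         "queue_depth 3 Queue DEPTH True",
                         "queue_depth 20 Queue DEPTH True"))
```

    The sketch only prints the command; review it and run it yourself on the client LPAR.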

    That also means you boot VIO locally, and not from SAN.

    Good luck.


    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    https://adamssystems.nl/




  • 3.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Mon April 17, 2023 02:01 AM

    Hi Luis

    It may be best to open an AIX support case for this one if you have not already tried that avenue.  The solution to intermittent LED 554 issues can involve tweaks on switches or storage but will likely need in-depth investigation starting with AIX support. (I used to work in AIX support handling boot and storage device driver issues).



    ------------------------------
    Chris Wickremasinghe
    IBM
    ------------------------------



  • 4.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Tue April 18, 2023 04:12 AM

    I fully agree. Our customers have no problems with AIX native MPIO booting from Dell EMC or Hitachi (i.e. non-IBM) SAN storage (high-end models) via NPIV.
    Open an IBM case and press for a solution.

    Regards Igor.



    ------------------------------
    Igor Novotny
    Principal Consultant
    MHM Computer, a.s.
    Prague 15
    00420602369375
    ------------------------------



  • 5.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Tue April 18, 2023 07:55 AM
    Edited by Henrik Morsing Tue April 18, 2023 08:07 AM
    Hi Luis,
     
    There is only one solution to your problem: send the Dell storage back. And please don't waste IBM's time opening a ticket; this is NOT an AIX issue.
     
    That being said, Dell does have a "hot-fix" that almost fixes the problem, but it's buried deep in a drawer somewhere and Dell's support people don't know about it. Tell your Dell account manager to get the hot-fix, if you have no option of sending the storage back.
     
    The hot-fix is: 3.2.0.1 (Hotfix, Build 1931125, 2023-01-27 20:38:38, Retail)

    Regards,
    Henrik Morsing



    ------------------------------
    Henrik Morsing
    ------------------------------



  • 6.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Tue April 18, 2023 09:29 AM

    It could be file system corruption. There is documentation on how to run fsck in SMS or maintenance mode. In the SMS boot menu, do you see your disk on the EMC storage?



    ------------------------------
    minesh patel
    ------------------------------



  • 7.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Wed April 19, 2023 03:46 AM

    Hello,

    I don't know which EMC storage we are talking about.
    I had a problem with EMC storage: AIX did not see the LUNs via MPIO, so I had to install EMC PowerPath.
    Best to check with EMC.
    It might be best to create an SSP.

    Regards,



    ------------------------------
    Bratislav Petkovic
    ------------------------------



  • 8.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Wed April 19, 2023 03:54 AM

    Hi Bratislav,

    This is a different issue: a 554 on boot with Dell storage, caused by a bug in the Dell storage system.

    But yes, would be good to get the exact model confirmed.

    Regards,
    Henrik Morsing



    ------------------------------
    Henrik Morsing
    ------------------------------



  • 9.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Fri April 21, 2023 12:07 PM

    Luis
    Have a good day!

    Dell/EMC claims that PowerPath is not necessary on AIX/VIOS and that the native AIXPCM driver is enough.
    In real life that is not true.

    There are two reasons:
    1. Marketing - The multipath driver for IBM storage systems is now included in AIX/VIOS. The Dell/EMC multipath driver is PowerPath, but it is very expensive. For this reason they try to avoid it, because otherwise they have to face customers with a real cost that is higher than a solution with IBM storage.
    2. Technical - You really need PowerPath to avoid issues. There is no other way. It may work "sometimes", but it is an unstable solution.
    You can tune the ODM and the devices, but it always remains an unstable solution. You will have a big problem with your customer while Dell/EMC is fine (and tells the customer that with its own products there is no issue: use Oracle Linux, OLVM, etc.).

    Best Regards



    ------------------------------
    Humberto Sosa
    ------------------------------



  • 10.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Wed April 26, 2023 06:38 AM

    Forgot to mention, booting (after hotfix) is more reliable if you set the boot paths to be the preferred paths from lsmpio.

    So 'lsmpio -l hdisk4':

    hdisk4   0        Enabled  Sel,Opt      fscsi0  58ccf0984d201f37,0
    hdisk4   1        Enabled  Non          fscsi0  58ccf0904d201f37,0
    hdisk4   2        Enabled  Sel,Opt      fscsi1  58ccf0984d211f37,0
    hdisk4   3        Enabled  Non          fscsi1  58ccf0904d211f37,0

    Then set the boot list:

    # bootlist -m normal hdisk4 pathid=0,2,1,3
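
    The ordering above can also be derived mechanically. The following is a minimal sketch, not a tested tool: it assumes the `lsmpio -l` column layout shown above (name, path ID, status, path status, parent, connection) and treats any path whose status contains "Opt" as preferred. The function names are made up for illustration.

```python
# Sample copied from the `lsmpio -l hdisk4` output quoted in this post.
SAMPLE = """\
hdisk4   0        Enabled  Sel,Opt      fscsi0  58ccf0984d201f37,0
hdisk4   1        Enabled  Non          fscsi0  58ccf0904d201f37,0
hdisk4   2        Enabled  Sel,Opt      fscsi1  58ccf0984d211f37,0
hdisk4   3        Enabled  Non          fscsi1  58ccf0904d211f37,0
"""


def preferred_path_order(lsmpio_text, disk):
    """Return path IDs for `disk`, optimized paths first, others after."""
    optimized, others = [], []
    for line in lsmpio_text.splitlines():
        fields = line.split()
        # Skip headers, separators and lines belonging to other disks.
        if len(fields) < 4 or fields[0] != disk or not fields[1].isdigit():
            continue
        path_id = fields[1]
        path_status = fields[3].split(",")
        if "Opt" in path_status:
            optimized.append(path_id)
        else:
            others.append(path_id)
    return optimized + others


def bootlist_command(lsmpio_text, disk):
    """Build the bootlist command string; it is printed, not executed."""
    order = preferred_path_order(lsmpio_text, disk)
    return "bootlist -m normal {} pathid={}".format(disk, ",".join(order))


if __name__ == "__main__":
    print(bootlist_command(SAMPLE, "hdisk4"))
```

    With the sample above, the sketch produces the same path order as the bootlist command shown in this post.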

    Regards,
    Henrik Morsing



    ------------------------------
    Henrik Morsing
    ------------------------------



  • 11.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted Wed April 26, 2023 03:20 PM

    Thank you, Henrik, and everyone who took the time to answer my note.

     

    Regards

     

    Luis Rojas

     






  • 12.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted 25 days ago

    Dear all,

    This problem seems to happen when the PowerPath filesets are installed while multiple paths are enabled, so the configuration is not completed properly.

    Disabling all paths except one gives the PowerPath drivers a chance to configure.

    For a VIOS client:
    1. Boot into maintenance mode with the filesystems mounted.
    2. Ensure your EMC configuration is enabled with "pprootdev on" and "pprootdev fix".
    3. From the VIOS servers, disable all paths except one path on one VIOS with the vfcmap command. On the client you can try any command and will find that rootvg is read-only.
    4. Re-enable all paths on all VIOS with vfcmap.
    5. Restart the client. PowerPath should populate the bootlist output with the bootable devices, and the client will boot.
    When you run "powermt display dev=all" you may now see the problematic device path as failed and defined, and you can use powermt to clean up.
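
    To reduce the risk of typing errors during the vfcmap steps, the unmap and remap commands can be generated up front. This is a rough sketch under assumptions: the VIOS names, vfchost adapters and fcs ports are hypothetical placeholders, and the commands are only printed for review. It relies on `vfcmap -vadapter <vfchost> -fcp` with no port name removing a mapping, and the same command with a port name restoring it.

```python
def single_path_plan(mappings, keep):
    """Build vfcmap commands that leave only `keep` mapped, plus the commands to undo that.

    mappings: list of (vios, vfchost, fcs_port) tuples for one client LPAR.
    keep:     the (vios, vfchost) pair that stays mapped.
    Returns (unmap, remap), each a list of (vios, command) tuples.
    """
    if keep not in [(vios, vfchost) for vios, vfchost, _ in mappings]:
        raise ValueError("path to keep is not in the mapping list")
    unmap, remap = [], []
    for vios, vfchost, fcs in mappings:
        if (vios, vfchost) == keep:
            continue
        # An empty -fcp argument removes the mapping for this adapter.
        unmap.append((vios, "vfcmap -vadapter {} -fcp".format(vfchost)))
        # Naming the port again restores the original mapping.
        remap.append((vios, "vfcmap -vadapter {} -fcp {}".format(vfchost, fcs)))
    return unmap, remap


if __name__ == "__main__":
    # Hypothetical layout: two VIOS, two NPIV paths each.
    paths = [
        ("vios1", "vfchost0", "fcs0"),
        ("vios1", "vfchost1", "fcs1"),
        ("vios2", "vfchost0", "fcs0"),
        ("vios2", "vfchost1", "fcs1"),
    ]
    unmap, remap = single_path_plan(paths, keep=("vios1", "vfchost0"))
    for vios, cmd in unmap + remap:
        print("{}: {}".format(vios, cmd))
```

    Capture the real mappings with `lsmap -all -npiv` on each VIOS before relying on a plan like this.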

    Regards



    ------------------------------
    Ian Bellinfantie
    ------------------------------



  • 13.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted 21 days ago

    Hi,

    LED 554 at "Kernel Starting" indicates a failure during the varyon of rootvg, and in environments using EMC/Dell SAN this is commonly related to multipath behavior during early boot rather than a requirement to operate with a single path.

    In most cases, the issue occurs because AIX may select a path that is visible but not active/optimized on the storage side.

    During the early boot phase, MPIO failover is limited, which can lead to rootvg not being accessible and the system stopping at LED 554.

    Reducing to a single path is not considered a best practice, as it removes redundancy and introduces operational risk.

    Recommended approach:

    • Ensure all paths to the boot LUN are consistently presented and correctly optimized on the storage side configuration.
    • Verify AIX MPIO configuration:
      • reserve_policy=no_reserve
      • Appropriate path selection algorithm (e.g. round_robin or fail_over depending on array type)
    • Confirm all paths are in Enabled state (lspath) with no missing or failed paths.
    • Rebuild boot image after any changes (bosboot, bootlist).
    • If applicable, consider using EMC PowerPath or the correct AIX PCM for the array.
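
    The attribute and path checks in this list can be sketched as a small script. It is an illustration only: it assumes text captured from `lsattr -El <disk>` and `lspath -l <disk>`, the sample values are invented to show a failing case, and the set of acceptable algorithms is an assumption that depends on the array.

```python
# Invented sample output, chosen to trigger findings.
LSATTR_SAMPLE = """\
algorithm       fail_over   Algorithm       True
queue_depth     20          Queue DEPTH     True
reserve_policy  single_path Reserve Policy  True
"""

LSPATH_SAMPLE = """\
Enabled hdisk4 fscsi0
Enabled hdisk4 fscsi0
Failed  hdisk4 fscsi1
Enabled hdisk4 fscsi1
"""


def parse_lsattr(text):
    """Map attribute name to value from `lsattr -El` style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            attrs[fields[0]] = fields[1]
    return attrs


def check_boot_disk(lsattr_text, lspath_text,
                    allowed_algorithms=("round_robin", "fail_over")):
    """Return a list of human-readable findings; an empty list means no issue seen."""
    findings = []
    attrs = parse_lsattr(lsattr_text)
    policy = attrs.get("reserve_policy")
    if policy != "no_reserve":
        findings.append("reserve_policy is {!r}, expected 'no_reserve'".format(policy))
    algorithm = attrs.get("algorithm")
    if algorithm not in allowed_algorithms:
        findings.append("algorithm is {!r}".format(algorithm))
    for line in lspath_text.splitlines():
        fields = line.split()
        # lspath prints: status, device name, parent adapter.
        if len(fields) >= 3 and fields[0] != "Enabled":
            findings.append("path via {} is {}".format(fields[2], fields[0]))
    return findings


if __name__ == "__main__":
    for finding in check_boot_disk(LSATTR_SAMPLE, LSPATH_SAMPLE):
        print(finding)
```

    A clean result here does not prove that the storage side presents the optimized paths correctly; it only covers the AIX-side items.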

    As a practical and commonly used alternative, some environments choose to present rootvg via vSCSI from VIOS, allowing VIOS to handle multipathing while keeping NPIV for data disks. This avoids early boot path selection issues.

    The correct resolution is to align SAN presentation + AIX MPIO configuration or use vSCSI for rootvg if needed for stability.



    ------------------------------
    Anas AlSaleh
    IBM Power Systems Software Specialist
    Saudi Business Machines ( SBM )
    Riyadh
    ------------------------------



  • 14.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted 21 days ago

    In this scenario PowerPath was already installed as the preferred multipathing software. There were only 2 FC ports for NPIV available on each VIOS for the 4 paths to the client, so there were no spare FC ports available to use as vSCSI ports. Reducing to a single port and immediately re-enabling all FC ports using vfcmap worked in this case to quickly enable the LPAR to boot with PowerPath without any device driver changes. In addition, PowerPath can be replaced with EMC.Symmetrix.MPIO.rte to enable the LPAR to use the native MPIO and so retain the use of NPIV for all LUNs, simplifying the configuration.



    ------------------------------
    Ian Bellinfantie
    ------------------------------



  • 15.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted 20 days ago

    Hello Luis,

    I worked with EMC storage and IBM AIX for some time in the past, and I remember that at one time LED 554 on boot was very likely. We got a fix (I will have to check my e-mail to see whether we fixed it ourselves or got a fix from IBM or from EMC). We also noticed that we often could not create a mksysb (since AIX couldn't determine the disk it booted from); we fixed that by replacing 'pprootdev' (a PowerPath script from EMC) with our own version.

    I will check my e-mail and get back here with some recommendations. Also, some of the suggestions here sound pretty good to me.



    ------------------------------
    Richard Westerik
    Principal specialist
    Simac IT NL bv
    Ede
    +31651575123
    ------------------------------



  • 16.  RE: Error 554 with DELL (EMC) SAN Storage boot disk.

    Posted 17 days ago

    I seem to have a better memory than I thought. I worked with EMC storage (on AIX) from 2004 to 2014. Some 554 issues were fixed by EMC (because of bugs in new versions of their software); for others we were able to create a workaround (like mksysb not always working because of PowerPath).

    What we did is create a script running at boot-time and at shutdown, running basically "pprootdev on" and "pprootdev fix". And we used a modified version of 'pprootdev' that reorders the devices in AIX's CuAt ODM database, so the disk the system was actually booted from appears first - fixing our mksysb problem. I have submitted my version to EMC support, but as far as I know, they haven't changed it during the time I worked with EMC storage.

    We found the issue to be intermittent, occurring only occasionally. If fairly persistent, then EMC suggested turning on AIX's low-level debugger and rebooting the system, to catch more diagnostic output so EMC (or IBM) could work with it; see instructions below if you would want to try that.

    I have always felt that EMC made a weird decision in combination with AIX. On AIX any multipathing is typically done hierarchically "below" the level of the hdisk (several other items, like pdisks making up an hdisk). EMC chose to put something "on top of" hdisks that it has to manage on its own, having difficulty making AIX fully aware of it.

    I hope you find a workable solution soon; my knowledge is probably too old to be useful nowadays.

    If you want to invoke the IBM low-level debugger at next boot:

    # bosboot -ad /dev/ipldevice -I

    When AIX boots, the low-level debugger pauses the system with a message "Welcome to KDB" and a prompt

     ************* Welcome to KDB *************
    Call gimmeabreak...
    Static breakpoint:
    .gimmeabreak+000000             tweq    r8,r8               r8=0
    .gimmeabreak+000004              blr                        <.kdb_init@AF91_62+0001F4> r3=0
    KDB(0)> mw enter_dbg
    enter_dbg+000000:  00000000  = 42
    enter_dbg+000004:  00000000  = .

    KDB(0)> g

    At the KDB(0) prompt you enter "mw enter_dbg".
    At the next prompt you enter the number 42 and press Enter.
    At the next prompt you enter a period (.) and press Enter.
    At the new KDB(0) prompt you enter "g" (for go) and press Enter.

    If the system boots and you want to "undo" the debugging, use bosboot again, without the -I option and reboot the system.

    # bosboot -ad /dev/ipldevice



    ------------------------------
    Richard Westerik
    Principal specialist
    Simac IT NL bv
    Ede
    +31651575123
    ------------------------------