AIX

Server IBM Power P750 error - VRM card fault

  • 1.  Server IBM Power P750 error - VRM card fault

    Posted Mon December 28, 2015 12:00 PM

    Originally posted by: promes


    Hi everyone,

    I have an IBM Power P750 8408-E8D server with 4 CPUs and 4 VRM cards. When a VRM card faults, the server goes down with the status "error".

    Could someone explain this in detail, or point me to any documentation on how the VRM cards are used?

    Thank you very much,



  • 2.  Re: Server IBM Power P750 error - VRM card fault

    Posted Mon December 28, 2015 02:10 PM

    Originally posted by: luverofpeanuts


     

    VRM is a voltage regulator module.  If that is what has failed, then the VRM needs to be replaced before the server can be powered up again.   You'll need to call IBM to have them replace the failing VRM.  

     

    I'm curious, did your HMC log any prior "predictive failure" warnings about the VRM?  Or did it just fail and take the server down?



  • 3.  Re: Server IBM Power P750 error - VRM card fault

    Posted Mon December 28, 2015 08:53 PM

    Originally posted by: promes


    Hi GWLeibfried,

    No, there were no warnings about the VRM. It just failed and took the server down.

    The server has 4 VRMs and 4 CPUs, so if one VRM fails, the server still has 3 VRMs. Why does the server go down?

    If each VRM is dedicated to one CPU, then when one VRM fails the server should still have 3 CPUs operating normally, yet the server goes down. Can you explain this, or point me to any documentation on how the VRM cards operate?

    Thank you,



  • 4.  Re: Server IBM Power P750 error - VRM card fault

    Posted Tue December 29, 2015 07:14 AM

    Originally posted by: luverofpeanuts


     

    This has happened on 2 of my p780 systems, which have many more VRMs and CPUs, but I had warnings/predictive errors logged on the HMC, so I was able to plan a shutdown and have the VRMs replaced.

     

    I think it would be best to get the explanation from IBM.  We were told the VRMs are not redundant as you expected, and if one VRM fails, it will take the system down hard.
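
    To illustrate what "not redundant" means in practice, here is a minimal sketch. It is a toy model only: the function name `system_up` and the N+1 comparison are assumptions made for illustration and do not describe IBM's actual power design.

```python
def system_up(vrm_ok, required):
    """Return True if at least `required` VRMs are healthy."""
    return sum(vrm_ok) >= required


# One of four VRMs has failed (illustrative state, not measured data).
vrms = [True, True, False, True]

# Non-redundant case, as described above: every VRM is required.
print(system_up(vrms, required=len(vrms)))      # False

# Hypothetical N+1 design, shown only for contrast: one failure tolerated.
print(system_up(vrms, required=len(vrms) - 1))  # True
```

    In the non-redundant case the three healthy VRMs do not help, which matches the behavior reported in this thread.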

     

     



  • 5.  Re: Server IBM Power P750 error - VRM card fault

    Posted Wed January 13, 2016 04:08 AM

    Originally posted by: Kruso


    Hi guys,

     

    We faced the same problem with our p750 Express hardware, namely 8408-E8D. Two out of six frames have already experienced it within the last two months. No preventive HMC alerts were seen prior to the full outage.

    IBM formally confirmed to us that this could be impacting the whole p700 product line and that a new, reengineered VRM component will be available in Q2 of 2016.

     

    Kruso.



  • 6.  Re: Server IBM Power P750 error - VRM card fault

    Posted Fri May 13, 2016 03:22 PM

    Originally posted by: g0rdy


    Hello, what were the warnings that you saw?



  • 7.  Re: Server IBM Power P750 error - VRM card fault

    Posted Fri May 13, 2016 04:54 PM

    Originally posted by: luverofpeanuts


     

    Actually, the HMC automatically opened a PMH and one of our guys got a call.  I think the events have been pruned off our HMCs.   

     

    In addition, there were 'sysplanar' errors showing up sporadically in the errpt reports on some of our AIX LPARs.    

     

    I'll attach an image of the Error/event log in ASM as well, for one of the frames.   
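
    For anyone who wants to check their own LPARs for the same kind of entries, here is a rough sketch that filters saved `errpt` summary output by resource name. It assumes the usual summary column layout (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION) and that the resource name starts with "sysplanar"; the function name `entries_for_resource` is made up for this example.

```python
def entries_for_resource(errpt_summary, resource_prefix="sysplanar"):
    """Return summary lines whose RESOURCE_NAME starts with resource_prefix.

    Assumes the errpt summary columns are:
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    """
    matches = []
    for line in errpt_summary.splitlines():
        # Split into at most 6 fields so the description stays in one piece.
        fields = line.split(None, 5)
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 5 or fields[0] == "IDENTIFIER":
            continue
        if fields[4].lower().startswith(resource_prefix.lower()):
            matches.append(line)
    return matches
```

    One way to use it is to redirect the output of `errpt` to a file on the LPAR, read that file as text, and pass the contents to `entries_for_resource`.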



  • 8.  Re: Server IBM Power P750 error - VRM card fault

    Posted Tue May 24, 2016 10:05 PM

    Originally posted by: KTM200SX


    We have experienced this issue multiple times now on a POWER7+ 32-way 750 8408-E8D running IBM i. The most recent was last Sunday, May 22nd, 2016!

    SRC 11002650

    The entire platform, all LPARs, hard fails. The same issue has occurred on the same system twice in the past 2 years, with no warning signs whatsoever!

    On both occasions we have needed to fly parts in and have replaced all VRMs on all CPU boards. Fortunately (good planning) we have replication in place to other like systems, so we can quickly execute an unplanned Role Swap to resume services.

    We are in the process of upgrading to E870 class POWER8 systems - hopefully these will not have this reliability issue.

     



  • 9.  Re: Server IBM Power P750 error - VRM card fault

    Posted Tue October 25, 2016 09:48 AM

    Originally posted by: FPLAZAVI


    Hello,

     

    We have the same problem: 11002700, a VRM fault on the processor card.

    It has happened three times this month, on different server models and at different locations (data centers).

    One 740 went down completely, on one 770 we only received the 11002700 notice, and another 770 lost its 12X loop. IBM tells us it is a problem with one series of VRMs.