HMC


Where to find power supply status, hardware sensor data?

  • 1.  Where to find power supply status, hardware sensor data?

    Posted Fri November 12, 2021 09:21 AM
    I've asked this before in other places, but the topic came up again today. Where can I find the current power supply status, line in status, CPU temp, fan speed, and other sensor data common to a high end system?

    The HMC does a great job of notifying when a power leg is lost: calling home, setting the alarm LED, etc. However, there's no way to tell when it's fixed.

    Back in AIX 4 we could use uesensor to get limited information, but that's no longer supported.

    machstat gives very poor information for rc.powerfail, so that's terribly incomplete.

    Where else can we look for this information?

    ------------------------------
    Russell Adams
    ------------------------------


  • 2.  RE: Where to find power supply status, hardware sensor data?

    Posted Wed November 24, 2021 11:27 AM

    Russell
    I don't know if you have checked this before. With the HMC API, it is theoretically possible.
    Regards

    Below is the HMC Knowledge Center documentation that gets updated periodically to reflect the changes in HMC REST Interfaces.

    https://www.ibm.com/support/knowledgecenter/POWER8/p8ehl/concepts/ApiOverview.htm


    Below is the HMC API that provides Power Supplies and Fans related information along with Status of each of these devices.

    https://<hmcip>/rest/api/uom/ManagedSystem/<UUID>?group=None&hwinventory=true

    In order to see these fields (power supply and fans), the following levels are needed on the HMCs and / or managed servers:

    HMC - 8.6 SP2 with PTF MH01716

    Server firmware - 860_103
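
    As a rough illustration of the request described above, here is a minimal Python sketch that only assembles the URL with the hwinventory query parameter. The function name, the optional port argument, and the placeholder host and UUID are assumptions made for the example and do not come from the HMC documentation. The request also needs an authenticated HMC session, which this thread does not describe and the sketch does not attempt.

```python
from urllib.parse import quote, urlencode


def build_hwinventory_url(hmc_host, system_uuid, port=None):
    """Return the ManagedSystem URL with the hwinventory query parameter set."""
    # The port is optional because the URL quoted in the post shows none.
    netloc = hmc_host if port is None else f"{hmc_host}:{port}"
    # Same query parameters, in the same order, as the URL in the post.
    query = urlencode({"group": "None", "hwinventory": "true"})
    path = f"/rest/api/uom/ManagedSystem/{quote(system_uuid, safe='')}"
    return f"https://{netloc}{path}?{query}"


if __name__ == "__main__":
    # Placeholder host and UUID, for illustration only.
    print(build_hwinventory_url("hmc.example.com",
                                "00000000-0000-0000-0000-000000000000"))
```

    Keeping the URL construction separate from the HTTP call makes it easy to check the query string before pointing anything at a real HMC.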



    ------------------------------
    Humberto Sosa
    ------------------------------



  • 3.  RE: Where to find power supply status, hardware sensor data?

    Posted Thu November 25, 2021 04:14 AM
    Do you have an example? I've looked through the API pages and it doesn't document anything about fans or power.

    I did see a python example for querying some data: https://www.ibm.com/support/pages/power8-watts-temp-ssp-io-serverlpar-stats-hmc-rest-api-version-10


    ------------------------------
    Russell Adams
    ------------------------------



  • 4.  RE: Where to find power supply status, hardware sensor data?

    Posted Fri November 26, 2021 03:54 AM
    Hi Russell,

    I haven't seen metrics for fan speeds in the HMC performance metrics data, but there are metrics for temperatures and power consumption.

    You can see an example of the output here:
    https://bitbucket.org/mnellemann/hmci/src/master/src/test/resources/pcm-data-energy.json


    Best regards,

    ------------------------------
    Mark Nellemann
    ------------------------------



  • 5.  RE: Where to find power supply status, hardware sensor data?

    Posted Fri November 26, 2021 04:28 AM
    On Fri, Nov 26, 2021 at 08:54:10AM +0000, Mark Nellemann via IBM Community wrote:
    > I haven't seen metrics for fan speeds in the HMC performance metrics
    > data, but there are metrics for temperatures and power consumption.
    >
    > You can see an example of the output here:
    > https://bitbucket.org/mnellemann/hmci/src/master/src/test/resources/pcm-data-energy.json

    This is a great example of temperature. I think temp and power
    consumption are addressed, but not power health.

    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    http://adamssystems.nl/




  • 6.  RE: Where to find power supply status, hardware sensor data?

    Posted Fri November 26, 2021 11:35 AM
    Hello Russell

    The sensors definitely exist, but for some reason they are hidden.

    Energy Monitoring
    The fan and PSU status may be gathered in the raw metrics.


    Another way may be to configure and manage your system by using the Intelligent Platform Management Interface (IPMI):
    ipmitool -I lanplus -H myserver.example.com -P mypass sdr list
        Lists the status of all sensors.
    ipmitool -I lanplus -H myserver.example.com -P mypass chassis status
        Checks the server status.
    Sensors

    IBM® Power Systems servers use a baseboard management controller (BMC) for system service management, monitoring, maintenance, and control. The BMC also provides access to the system event log files (SEL). The BMC is a specialized service processor that monitors the physical state of the system by using sensors. 
    Managing the system by using OpenBMC-based HMC (7063-CR2)
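
    If the sdr list command above is usable against your BMC, its output can be filtered with a few lines of Python. This is only a sketch: it assumes the usual three-column "name | reading | status" layout that ipmitool prints, and the sample text is made up for illustration rather than captured from a real Power system. Treating only "ok" as healthy is also an assumption, so sensors with no reading would be flagged as well.

```python
def parse_sdr_list(output):
    """Parse `ipmitool sdr list` style text into (name, reading, status) tuples."""
    rows = []
    for line in output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        # Skip blank lines and anything that is not a three-column sensor row.
        if len(parts) != 3:
            continue
        rows.append(tuple(parts))
    return rows


def sensors_needing_attention(rows, healthy=("ok",)):
    """Return the rows whose status column is not in the healthy set."""
    return [row for row in rows if row[2] not in healthy]


# Hypothetical sample, invented for this example only.
SAMPLE_OUTPUT = """\
Fan 1            | 5400 RPM          | ok
PSU 1 Status     | 0x00              | ok
PSU 2 Status     | 0x00              | cr
Ambient Temp     | 24 degrees C      | ok
"""

if __name__ == "__main__":
    for name, reading, status in sensors_needing_attention(parse_sdr_list(SAMPLE_OUTPUT)):
        print(f"{name}: {reading} ({status})")
```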


    ------------------------------
    Humberto Sosa
    ------------------------------



  • 7.  RE: Where to find power supply status, hardware sensor data?

    Posted Wed December 01, 2021 10:49 AM

    Have a look at Nigel's stuff

    AIXpert Blog from Nigel Griffiths (@mr_nmon)



    nextract Plus for HMC REST API Performance Statistics






    ------------------------------
    Bryan Dietz
    ------------------------------



  • 8.  RE: Where to find power supply status, hardware sensor data?

    Posted Mon December 06, 2021 09:18 AM
    I wanted to note that this is still unsolved. While it's nice that the HMC BMC may offer IPMI information, and there may be limited sensor data available from a REST API, I still have no functional way to check the health of the power supplies and the line status of my POWER systems. Ideally an AIX command would be nice.

    ------------------------------
    Russell Adams
    ------------------------------



  • 9.  RE: Where to find power supply status, hardware sensor data?

    Posted Tue December 07, 2021 09:23 AM
    Hi Russell,

    Fan and Power Supply status is available via the API mentioned above, using the hwinventory query parameter.

    Energy Metrics covers Power and Thermal metrics on supported systems (not supported on High-end Power Systems currently).

    Can you please let us know what additional information you are looking for?

    Thanks.

    ------------------------------
    Hariganesh Muralidharan
    Cognitive Systems Management Architecture
    IBM
    ------------------------------



  • 10.  RE: Where to find power supply status, hardware sensor data?

    Posted Wed December 08, 2021 07:57 AM
    On Tue, Dec 07, 2021 at 02:22:42PM +0000, HARIGANESH MURALIDHARAN via IBM Community wrote:
    > Fan and Power Supply status is available via the API mentioned above with hwinventory query parameter.
    >
    > Energy Metrics covers Power and Thermal metrics on supported systems (not supported on High-end Power Systems currently).
    >
    > Can you please let us know on what additional information you are looking for ?

    I intend to go test the REST API, but I think perhaps this is a more
    fundamental problem.

    Why doesn't the HMC clearly show this status? Why isn't there a
    command in AIX or the HMC CLI to display these values?

    I have to drop down into web requests to a backend API that was opened
    for compatibility with some devops products to find out server status?
    Remember curl and wget aren't included with AIX, so either I have to
    go to a Linux box or write Perl code to query this.

    While I'll be thrilled if I can finally check this remotely, this is
    an unusual way to go about retrieving this information for a top of
    the line platform.

    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    http://adamssystems.nl/




  • 11.  RE: Where to find power supply status, hardware sensor data?

    Posted Thu December 09, 2021 08:33 AM
    Edited by Carl Gerlach Thu December 09, 2021 08:36 AM

    Sorry about the post; I see that the BMC portal has been discussed previously and doesn't resolve the request.

    I found that if you don't need automated monitoring of the HMC power and health status, the BMC portal for the HMC provides all of the information you might need.

    Sensor readings, CPU temps, system temp, DIMM temps.

    [Screenshot: BMC web interface, sensor readings]

    Power consumption peaks, graphs, etc.

    [Screenshot: HMC BMC portal, power consumption]

    Power source status.

    [Screenshot: HMC BMC, power source readings]



    ------------------------------
    Carl Gerlach
    ------------------------------



  • 12.  RE: Where to find power supply status, hardware sensor data?

    Posted Thu December 09, 2021 09:33 AM
    On Thu, Dec 09, 2021 at 01:33:31PM +0000, Carl Gerlach via IBM Community wrote:
    > I found that if you don't need an automated monitoring of the HMC
    > power and health status, the BMC portal for the HMC provides all of
    > the information you might need.

    Carl, I'm afraid there's a misunderstanding. I'm asking where on the
    HMC to find the power supply status and line in status for the POWER
    systems the HMC manages. The BMC of the HMC is thus not relevant.

    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    http://adamssystems.nl/




  • 13.  RE: Where to find power supply status, hardware sensor data?

    Posted Wed December 08, 2021 08:42 AM
    Hi Hariganesh,

    I am curious about the hwinventory query parameter.
    I can't find any references to this in the REST api documentation.

    Can you provide me with links or more details?


    Best regards,

    ------------------------------
    Mark Nellemann
    ------------------------------



  • 14.  RE: Where to find power supply status, hardware sensor data?

    Posted Thu December 09, 2021 01:01 AM
    Mark,

    Below is the HMC API that provides Power Supplies and Fans related information along with Status of each of these devices.

    https://<hmcip>/rest/api/uom/ManagedSystem/<UUID>?hwinventory=true


    The ManagedSystem response that shows the PowerSupply and FAN details looks something like this:

    <PowerSupplies ksv="V1_5_1" kxe="false" kb="ROO" schemaVersion="V1_0">
        <Metadata>
            <Atom/>
        </Metadata>
        <PowerSupply schemaVersion="V1_0">
            <Metadata>
                <Atom/>
            </Metadata>
            <LocationCode ksv="V1_5_1" kb="ROO" kxe="false">U78CA.001.CSS03DJ-E1</LocationCode>
            <FruNumber ksv="V1_5_1" kb="ROO" kxe="false"> 00RR362</FruNumber>
            <SerialNumber ksv="V1_5_1" kxe="false" kb="ROO">YL10KF51M027</SerialNumber>
            <State ksv="V1_5_1" kb="ROO" kxe="false">StandbyOffline</State>
            <Health ksv="V1_5_1" kb="ROO" kxe="false">Warning</Health>
            <Description ksv="V1_5_2" kb="ROO" kxe="false">Modular PowerSupply</Description>
            <MemberId ksv="V1_5_2" kxe="false" kb="ROO">1000</MemberId>
        </PowerSupply>
        <PowerSupply schemaVersion="V1_0">
            <Metadata>
                <Atom/>
            </Metadata>
            <LocationCode ksv="V1_5_1" kb="ROO" kxe="false">U78CA.001.CSS03DJ-E2</LocationCode>
            <FruNumber ksv="V1_5_1" kb="ROO" kxe="false"> 00RR362</FruNumber>
            <SerialNumber ksv="V1_5_1" kxe="false" kb="ROO">YL10KF51M052</SerialNumber>
            <State ksv="V1_5_1" kb="ROO" kxe="false">StandbyOffline</State>
            <Health ksv="V1_5_1" kb="ROO" kxe="false">Warning</Health>
            <Description ksv="V1_5_2" kb="ROO" kxe="false">Modular PowerSupply</Description>
            <MemberId ksv="V1_5_2" kxe="false" kb="ROO">1001</MemberId>
        </PowerSupply>
        ……
    </PowerSupplies>
    <FANs ksv="V1_5_1" kxe="false" kb="ROO" schemaVersion="V1_0">
        <Metadata>
            <Atom/>
        </Metadata>
        <FAN schemaVersion="V1_0">
            <Metadata>
                <Atom/>
            </Metadata>
            <LocationCode ksv="V1_5_1" kxe="false" kb="ROO">U78CA.001.CSS03DJ-A1</LocationCode>
            <FruNumber ksv="V1_5_1" kb="ROO" kxe="false">00E9335</FruNumber>
            <SerialNumber ksv="V1_5_1" kb="ROO" kxe="false">YL1424GDPUML</SerialNumber>
            <State ksv="V1_5_1" kb="ROO" kxe="false">StandbyOffline</State>
            <Health ksv="V1_5_1" kb="ROO" kxe="false">Warning</Health>
            <Description ksv="V1_5_2" kxe="false" kb="ROO"/>
            <MemberId ksv="V1_5_2" kxe="false" kb="ROO">2104</MemberId>
        </FAN>
        ………
    </FANs>
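
    For anyone who wants to script against this, a response shaped like the fragment above could be reduced to one line per component with something like the following Python sketch. The function names and the trimmed sample wrapped in a ManagedSystem root element are assumptions made for the example. Namespaces are stripped because a real response may carry them, and the sketch only reports the State and Health values it finds, since which values count as healthy is not spelled out here.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the fragment in this post; the wrapping root
# element is an assumption so the text parses as a single document.
SAMPLE = """<ManagedSystem>
  <PowerSupplies>
    <PowerSupply>
      <LocationCode>U78CA.001.CSS03DJ-E1</LocationCode>
      <State>StandbyOffline</State>
      <Health>Warning</Health>
    </PowerSupply>
  </PowerSupplies>
  <FANs>
    <FAN>
      <LocationCode>U78CA.001.CSS03DJ-A1</LocationCode>
      <State>StandbyOffline</State>
      <Health>Warning</Health>
    </FAN>
  </FANs>
</ManagedSystem>"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def component_status(xml_text):
    """Return (kind, location, state, health) for each PowerSupply and FAN."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        kind = local_name(elem.tag)
        if kind not in ("PowerSupply", "FAN"):
            continue
        # Map each direct child element to its stripped text content.
        fields = {local_name(child.tag): (child.text or "").strip()
                  for child in elem}
        rows.append((kind,
                     fields.get("LocationCode", ""),
                     fields.get("State", ""),
                     fields.get("Health", "")))
    return rows


if __name__ == "__main__":
    for kind, location, state, health in component_status(SAMPLE):
        print(f"{kind} {location}: state={state} health={health}")
```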



    ------------------------------
    Sridevi Joshi
    ------------------------------