
POWER9 EnergyScale - Configuration and Management

By Todd J Rosedahl posted Wed June 17, 2020 03:35 PM

We previously published a blog that gave an overview of POWER9 EnergyScale.  This second blog covers how to configure and manage energy management on the server.

Changing the Energy Management Mode
The Energy Management mode can be changed in three ways: using the HMC GUI, using the HMC command line, or using the ASMI menus.  All three options provide the same level of functionality.  Note that changing the mode is a dynamic operation: it does not require a reboot and takes effect immediately.

The Power Mode Setup panel can be displayed by selecting the server, then selecting "Actions", "View All Actions", and "Power Management".  This displays the following panel:
PowerVM EnergyScale HMC Setup

HMC Command Line
The HMC commands are listed below for those interested in scripting.
# chpwrmgmt -m <managed system name> -r sys -o enable -t <mode>

Where <mode> can be one of the supported modes reported by:
# lspwrmgmt -m <managed system name> -r sys -F supported_power_saver_mode_types

To disable the power saver modes and return to static nominal:
# chpwrmgmt -m <managed system name> -r sys -o disable
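
For scripting, the two commands above can be combined so that a mode is only requested when the managed system reports it as supported.  The sketch below is a minimal example under stated assumptions, not a definitive implementation.  It is meant to run on a workstation with SSH access to the HMC.  The function names (mode_is_supported, set_power_mode) and the hscroot login are illustrative.  It also assumes that the lspwrmgmt -F output is a comma-separated list that may be wrapped in double quotes, so check the actual output on your HMC before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if mode $2 appears in the comma-separated
# list $1. ASSUMPTION: the list may be wrapped in double quotes.
mode_is_supported() {
    [ -n "$2" ] || return 1
    list=$(printf '%s' "$1" | tr -d '"' | tr -d '[:space:]')
    case ",$list," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical wrapper: query the supported modes over SSH, then enable the
# requested one. Arguments: HMC host, managed system name, mode.
# ASSUMPTION: the HMC user is hscroot; adjust for your environment.
set_power_mode() {
    hmc_host=$1 system=$2 mode=$3
    supported=$(ssh "hscroot@$hmc_host" \
        "lspwrmgmt -m '$system' -r sys -F supported_power_saver_mode_types") || return 1
    if mode_is_supported "$supported" "$mode"; then
        ssh "hscroot@$hmc_host" "chpwrmgmt -m '$system' -r sys -o enable -t '$mode'"
    else
        echo "mode '$mode' not in supported list: $supported" >&2
        return 1
    fi
}
```

A call would then look like: set_power_mode <HMC host> <managed system name> <mode>.  Checking the supported list first avoids sending a mode name that the system would reject.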

Advanced System Manager Interface (ASMI)
The figure below shows the ASMI menu options with the arrows pointing to the key selections.
PowerVM energyScale ASMI Options
In addition to the four modes mentioned, there is the option of turning on Idle Power Saver.  With Idle Power Saver enabled, the frequency will be reduced after long periods of complete system idleness (minutes).  There are settings for idle power delay time and idle usage thresholds.  Whether Idle Power Saver is on or off, the Dynamic Performance mode will still drop to the minimum frequency when there is little or no workload for milliseconds.  When in Maximum Performance mode or the nominal mode, the frequency will only drop if Idle Power Saver is turned on.  Further information on Idle Power Saver can be found at this web site: https://www.ibm.com/support/knowledgecenter/5148-22L/p8hby/ideal_power.htm.

Measuring the frequency from AIX
The recommended method for measuring core frequency in AIX is to use mpstat -E 1 1 or lparstat -E 1 1.  The command mpstat -E 1 1 reports individual core frequencies, while lparstat -E 1 1 averages the frequencies across all the cores in the LPAR.

The lparstat -E 1 1 command will show the current power saver mode and the current frequency:
PowerVM energyScale lparstat
Documentation for the lparstat and mpstat commands is available in the AIX commands reference.

Note: The pmcycles command in AIX should NOT be used for reading the current processor frequency. The recommended approach is to use the lparstat -E 1 1 and mpstat -E 1 1 commands, as discussed above.

Measuring the frequency from Linux
For Linux, there are several ways to look at CPU frequencies, depending on the flavor of Linux being used.  The lscpu command will show the running frequency on some flavors of Linux, but not all.  The dmesg command (dmesg | grep freq) will also provide frequency information.  The file /proc/cpuinfo also contains frequency information.

On some flavors of Linux, the commands below can be used.
Nominal frequency range:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

Energy Scale Frequency range:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_boost_frequencies

Current running frequency of any core:
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
Note that in a purely Linux environment (with OPAL), the Linux operating system sets the frequency on a per core basis.  In this case, the frequency is capped by the POWER On Chip Controller (OCC).
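
The sysfs files above can be read for every CPU in a loop rather than for cpu0 only.  The sketch below assumes that the cpufreq sysfs interface is present and that values are reported in kHz, as is standard for cpufreq.  The function names khz_to_mhz and print_core_frequencies are illustrative.  Reading cpuinfo_cur_freq typically requires root, so the loop skips files it cannot read.

```shell
#!/bin/sh
# Convert a cpufreq value in kHz to whole MHz (integer division).
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Print the current frequency of every CPU that exposes cpufreq data.
# The optional first argument overrides the sysfs root (useful for testing).
print_core_frequencies() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/cpuinfo_cur_freq; do
        [ -r "$f" ] || continue          # missing, or not readable without root
        cpu=${f#"$root"/}                # e.g. cpu0/cpufreq/cpuinfo_cur_freq
        cpu=${cpu%%/*}                   # e.g. cpu0
        echo "$cpu: $(khz_to_mhz "$(cat "$f")") MHz"
    done
}
```

Calling print_core_frequencies with no arguments reads the live system.  On flavors of Linux that do not expose these files, it prints nothing.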

Measuring the frequency from IBM i
For the IBM i operating system, the following methods can be used to collect performance information.

IBM iDoctor for IBM i displays the CPU rate for the IBM i partition over time on the Collection Overview graph.  The CPU rate for the partition is the ratio of scaled to unscaled processor utilized time, expressed as a percentage.  There are two hardware registers that provide energy scaling information: the PURR and the SPURR.  The Processor Utilization Register (PURR) is incremented monotonically as work is performed on a processor.  The Scaled Processor Utilization Register (SPURR) is scaled in relation to the current frequency.  If the processor is running at nominal frequency, the PURR and the SPURR will accumulate cycles at the same rate.  If the processor is running above nominal frequency, the ratio of the SPURR to the PURR will correspond to the increase in frequency.  The processor utilized time reported by IBM i is the accumulation of non-idle virtual processor SPURR and PURR over each time interval.

The Work with System Activity (WRKSYSACT) command displays the Average CPU rate.  The Average CPU rate for the partition is the ratio of scaled to unscaled processor utilized time, expressed as a percentage.  The processor utilized time is the accumulation of non-idle virtual processor SPURR and PURR for the interval since the last refresh.

PowerVM energyScale wrksysact

IBM i Collection Services Database file QAPMJOBMI contains time series data by task, primary thread, and secondary thread.  Scaled and unscaled CPU times are available to calculate average CPU rate for processing activity of tasks and threads.

Database file QAPMSYSTEM contains time series system-wide (i.e. partition) accumulations of performance data.  Scaled and unscaled CPU times are accumulated for various categories of processor usage.  The ratio of scaled to unscaled time is the average CPU rate for the category of time accumulation.  The processor utilized time is the accumulation of non-idle virtual processor SPURR and PURR for the time interval.

As of IBM i 7.3, the QAPMCONF database file key "NF" contains the processor nominal frequency in MHz.  The processor nominal frequency can be used to convert average CPU rate to average processor frequency.
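
The conversion described above is simple arithmetic: the average CPU rate is the ratio of scaled to unscaled CPU time as a percentage, and multiplying the nominal frequency by that rate gives the average frequency.  The sketch below assumes that the scaled and unscaled CPU times have already been extracted from Collection Services in the same unit.  The function names are hypothetical and the numbers in the usage note are illustrative, not measurements.

```shell
#!/bin/sh
# Average CPU rate (%) = scaled CPU time / unscaled CPU time * 100.
# Arguments: scaled time, unscaled time (same unit). Fails if unscaled <= 0.
avg_cpu_rate() {
    awk -v s="$1" -v u="$2" 'BEGIN { if (u <= 0) exit 1; printf "%.1f\n", s / u * 100 }'
}

# Average frequency (MHz) = nominal frequency (MHz) * average CPU rate / 100.
# Arguments: nominal frequency in MHz (the "NF" key), average CPU rate in %.
avg_frequency_mhz() {
    awk -v nf="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", nf * rate / 100 }'
}
```

For example, with a scaled time of 1200 and an unscaled time of 1000, avg_cpu_rate reports 120.0.  With a nominal frequency of 2300 MHz, avg_frequency_mhz 2300 120.0 then gives 2760 MHz.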

Frequency Range for POWER9 scale-out servers
The chart below lists the generally available frequencies for the newly announced POWER9 scale-out systems.  Note that these frequency values may change.
PowerVM energyScale frequencies
Varying frequencies is not new to Power systems, but new options are available, and the default mode that the system runs in has changed.  What drove the change to the default mode?  Why leave all that performance on the table when the system is capable of so much more?

POWER9 EnergyScale gives you the flexibility to optimize your server to match the performance or energy requirements of your data center.

Contacting the PowerVM Team
Have questions for the PowerVM team or want to learn more?  Follow our discussion group on LinkedIn (IBM PowerVM) or on IBM Community Discussions.