Overview of the OCC for POWER8

By Todd J Rosedahl posted Wed June 17, 2020 03:34 PM

Overview of OCC
For POWER8 systems, a new piece of hardware and its associated firmware were introduced: the On Chip Controller (OCC). The OCC is a separate PowerPC 405 processor that is embedded directly on the chip along with the main POWER processor cores. It has its own dedicated 512 KB of SRAM, access to main memory, and two dedicated General Purpose off-load Engines (GPEs).

The main OCC firmware runs a 250 µs loop that uses the GPEs to continuously collect system data. It uses that data to keep the system under its power and thermal limits and running optimally based on user-specified modes and parameters. The OCC has access to detailed temperature, power, and utilization data, as well as complete control of processor frequency and memory bandwidth. This enables improvements in performance and energy management, and provides additional system reliability and availability. Users have the flexibility to enable modes that dramatically increase system performance, reduce power consumption, or maximize system performance per watt.
PowerVM OCC Overview
Features/Capabilities built into the system as base function
Energy and Temperature information
The OCC provides access to system power and temperature information. This information can be accessed via the Common Information Model (CIM) and the Intelligent Platform Management Interface (IPMI). Operators can use it to tune power and cooling distribution for efficiency and to look for potential issues.

Power Capping
The OCC enforces a system power limit that improves system availability by ensuring that the system never exceeds designed power limits.

System Availability
The OCC automatically detects power supply failures and AC input losses and responds within milliseconds by lowering the system power consumption to enable the system to run through such events.

System Reliability
The OCC is used to keep component temperatures within reliability limits, which extends device lifetime and limits service costs.

Features/Capabilities that can be enabled/disabled/modified by users

Performance Boost
The POWER processors can be set to frequencies above nominal.   The OCC monitors the system and controls the processor frequency and memory bandwidth to keep the system thermally safe and within acceptable power limits.

Energy Saving
When the system utilization is low, the OCC infrastructure can be used to put the system into a low power state.  One available mode, called "Idle Power Saver", can be tuned by the user to act at various processor utilization levels and idle times in order to match system usage patterns.
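The enter/exit behavior described above can be pictured as a small state machine with hysteresis. The following is a minimal sketch only, not the OCC firmware: the class name, parameter names, and default values are hypothetical and chosen for illustration. It assumes the mode is entered after utilization stays below one threshold for a delay period, and exited after utilization stays above a second threshold for another delay period.

```python
from dataclasses import dataclass


@dataclass
class IdlePowerSaver:
    """Illustrative hysteresis model; all names and defaults are hypothetical."""
    enter_util_pct: float = 8.0    # enter when utilization stays below this
    enter_delay_s: float = 240.0   # ...for at least this long
    exit_util_pct: float = 12.0    # exit when utilization stays above this
    exit_delay_s: float = 10.0     # ...for at least this long
    active: bool = False           # True while in the low power state
    _timer_s: float = 0.0          # how long the current condition has held

    def update(self, util_pct: float, dt_s: float) -> bool:
        """Feed one utilization sample covering dt_s seconds; return the state."""
        if self.active:
            condition = util_pct > self.exit_util_pct
            delay = self.exit_delay_s
        else:
            condition = util_pct < self.enter_util_pct
            delay = self.enter_delay_s

        # The condition must hold continuously; any break resets the timer.
        self._timer_s = self._timer_s + dt_s if condition else 0.0

        if self._timer_s >= delay:
            self.active = not self.active
            self._timer_s = 0.0
        return self.active
```

Using separate enter and exit thresholds keeps the system from flapping between states when utilization hovers near a single cutoff, which is one reason tunables of this kind are useful for matching a workload's usage pattern.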

Performance per Watt tuning
Modes such as "Dynamic Power Saver" can be enabled by the user.  As the system utilization varies, the OCC controls the processor frequency to maximize system performance per watt metrics.  Algorithm tuning parameters are exposed such that users can modify the OCC behavior to match their specific needs and workloads.
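To make the idea of utilization-driven frequency control concrete, here is a deliberately simple sketch. It is not the OCC's actual algorithm, which is not described in this article; the function name, the linear mapping, and the two threshold parameters are assumptions made purely for illustration of how tuning parameters could shape the response.

```python
def dps_target_frequency(util_pct: float,
                         f_min_mhz: float,
                         f_max_mhz: float,
                         low_util_pct: float = 20.0,
                         high_util_pct: float = 80.0) -> float:
    """Map utilization to a target frequency (hypothetical linear policy).

    Below low_util_pct the minimum frequency is used, above high_util_pct
    the maximum, and in between the frequency is interpolated linearly.
    """
    if util_pct <= low_util_pct:
        return f_min_mhz
    if util_pct >= high_util_pct:
        return f_max_mhz
    fraction = (util_pct - low_util_pct) / (high_util_pct - low_util_pct)
    return f_min_mhz + fraction * (f_max_mhz - f_min_mhz)


if __name__ == "__main__":
    for util in (10, 50, 90):
        print(util, dps_target_frequency(util, 2000.0, 4000.0))
```

In this toy model, moving the two thresholds plays the role of the exposed tuning parameters: lowering `high_util_pct` favors performance, while raising `low_util_pct` favors power savings.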

User Power Capping
This feature enables a user to set a system power limit. The OCC will continually monitor the power consumption and will reduce the allowed processor frequency to maintain that power limit. Note that this is a separate limit from the system power cap described above, and the OCC always enforces the lower of the two limits.
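The relationship between the two limits can be sketched in a few lines. This is an illustrative model under stated assumptions, not OCC code: the function names, the fixed frequency step, and the simple step-down/step-up policy are hypothetical. Only the "lower of the two limits" rule and the use of frequency reduction to hold the cap come from the description above.

```python
from typing import Optional


def effective_power_cap(system_cap_w: float,
                        user_cap_w: Optional[float]) -> float:
    """Return the limit to enforce: the lower of the system and user caps."""
    if user_cap_w is None:  # no user cap configured
        return system_cap_w
    return min(system_cap_w, user_cap_w)


def next_frequency(current_mhz: float,
                   measured_w: float,
                   cap_w: float,
                   f_min_mhz: float,
                   f_max_mhz: float,
                   step_mhz: float = 50.0) -> float:
    """Hypothetical policy: step down when over the cap, step up otherwise."""
    if measured_w > cap_w:
        return max(f_min_mhz, current_mhz - step_mhz)
    return min(f_max_mhz, current_mhz + step_mhz)


if __name__ == "__main__":
    cap = effective_power_cap(system_cap_w=2000.0, user_cap_w=1500.0)
    print(cap)
    print(next_frequency(3000.0, measured_w=1600.0, cap_w=cap,
                         f_min_mhz=2000.0, f_max_mhz=3500.0))
```

Clamping to the minimum and maximum frequency keeps the sketch from requesting a frequency outside the supported range, whatever the measured power is.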

Note that support for these features is system dependent, and not all features are available on all systems.  See the white paper below for details.

Refer to the EnergyScale White Paper (http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=POW03125USEN) for more information on the various modes supported by the OCC and instructions on enabling the user controlled features. 

Contacting the PowerVM Team
Have questions for the PowerVM team or want to learn more?  Follow our discussion group on LinkedIn (IBM PowerVM) or in IBM Community Discussions.