The T11 Fibre Channel standards group has introduced a new fabric congestion notification mechanism in FC-LS-5, which adds a new Extended Link Service (ELS): Fabric Performance Impact Notification (FPIN). When the fabric detects congestion or link issues, it sends an FPIN ELS to all N_Ports that have registered to receive FPINs. N_Ports (such as HBAs) must therefore register in order to receive the FPIN ELS. The FPIN ELS provides three categories of notifications (events) from the fabric:
- Congestion – indicates a link is overused. This event may be generated repeatedly until congestion subsides.
- Link Incident – indicates a threshold has been exceeded for the link, for example too many CRC errors.
- Discarded FC frame(s) – indicates the fabric has dropped frame(s) to specific targets.
In October 2020, Brocade added support for FPIN ELS to its switches in FOS 9.0 and higher. AIX 7.2 TL 5 and VIOS 3.1.2 also add support for FPIN ELS on all 16Gb (and faster) FC adapters. This support includes AIX 7.2 TL 5 NPIV clients, provided that the client is attached to VIOS 3.1.2.
SETUP
AIX/VIOS automatically checks for FPIN ELS support in the fabric and, if it is available, registers to receive FPINs. No AIX settings need to be changed to enable FPIN support; it is automatic.
MPIO (Multi-path I/O) support
FPIN ELS notifications for congestion and link incident events are passed to the AIX MPIO (Multi-path I/O) layer's Active/Active PCM (Path Control Module), which is shipped in base AIX. In general, the Active/Active PCM treats impacted paths as “Degraded” paths, meaning that it selects other paths for I/O whenever there are other paths that are not degraded. In addition, the lsmpio command has been enhanced to display the following new values in the extended path_status field to indicate these events/states:
- LCn – Link between HBA and switch is congested
- PCn – Link between switch and storage target port is congested
- PDg – A link experienced a link incident event (e.g. too many CRC errors)
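A monitoring script could scan lsmpio output for these flags. The following is a minimal Python sketch, not an IBM-provided tool. It assumes the flags appear as whole tokens separated by whitespace or commas; the helper name `fpin_flags` and the sample lines are hypothetical, and the real lsmpio column layout may differ.

```python
import re

# Flag meanings as listed above.
FPIN_FLAGS = {
    "LCn": "link between HBA and switch is congested",
    "PCn": "link between switch and storage target port is congested",
    "PDg": "a link experienced a link incident event",
}

# Match a flag only as a whole token, not as part of a longer word.
_FLAG_RE = re.compile(r"(?<![A-Za-z0-9])(LCn|PCn|PDg)(?![A-Za-z0-9])")


def fpin_flags(line):
    """Return the FPIN-related flags found in one line, in order of appearance."""
    return _FLAG_RE.findall(line)


# Hypothetical sample text; not captured from a real system.
sample = """\
hdisk1 0 Enabled Sel,LCn fscsi0
hdisk1 1 Enabled Sel fscsi1
hdisk1 2 Enabled PCn,PDg fscsi2
"""

for line in sample.splitlines():
    fields = line.split()
    for flag in fpin_flags(line):
        print(f"{fields[0]} path {fields[1]}: {flag} ({FPIN_FLAGS[flag]})")
```

In practice the text would come from running lsmpio on the AIX host rather than from an embedded sample.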
For congestion events, the Active/Active PCM automatically clears the congestion indication on a path if no further congestion notifications for that path are reported within a certain interval. A link incident event is cleared by the Active/Active PCM when a link bounce for that link has been detected.
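The degrade-and-clear behavior described above can be modeled roughly as follows. This is an illustrative Python sketch only, not the PCM's actual implementation. The class and function names are hypothetical, and the 60-second clearing interval is an assumed placeholder, since the actual interval is not stated here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed value for illustration only; the actual interval is not stated.
CONGESTION_CLEAR_SECONDS = 60.0


@dataclass
class Path:
    name: str
    last_congestion: Optional[float] = None  # time of the latest congestion FPIN
    link_incident: bool = False              # set by a link incident FPIN

    def on_congestion(self, now: float) -> None:
        self.last_congestion = now

    def on_link_incident(self) -> None:
        self.link_incident = True

    def on_link_bounce(self) -> None:
        # A detected link bounce clears the link incident indication.
        self.link_incident = False

    def congested(self, now: float) -> bool:
        # Congestion lapses once no notification has arrived within the interval.
        return (self.last_congestion is not None
                and now - self.last_congestion < CONGESTION_CLEAR_SECONDS)

    def degraded(self, now: float) -> bool:
        return self.link_incident or self.congested(now)


def candidate_paths(paths: List[Path], now: float) -> List[Path]:
    """Prefer non-degraded paths; fall back to all paths if every one is degraded."""
    healthy = [p for p in paths if not p.degraded(now)]
    return healthy or list(paths)


a, b = Path("fscsi0"), Path("fscsi1")
a.on_congestion(now=0.0)
print([p.name for p in candidate_paths([a, b], now=10.0)])   # ['fscsi1']
print([p.name for p in candidate_paths([a, b], now=120.0)])  # ['fscsi0', 'fscsi1']
```

Deriving the degraded state from timestamps at selection time, rather than running a separate timer, keeps the sketch short; a real driver would likely handle this differently.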
Here is a link to a YouTube video demo of FPIN from Brocade using AIX 7.2 TL 5: https://www.youtube.com/watch?v=RNoMMfviJ-Q&feature=youtu.be