Link Exceptions Help


Overview

A link exception occurs when the SAIL ASIC on a SPA detects a problem receiving data on its ServerNet SAN X or Y port. If a transmission error occurs during normal operation, the receiving SPA reports the error back to the transmitting SPA in the form of a "This Link Bad" (TLB) command symbol. In this way, SPAs learn of both transmission and reception problems. Link exceptions can also occur for a brief period when nodes are added to or removed from a cluster.

When a link exception occurs, the SAIL ASIC captures the cause of the exception in a bit in one of its two link exception registers. There is one link exception register for the X port and one for the Y port. Each type of link exception has a corresponding bit in these registers. When the SPA driver (SPAD) takes an interrupt for a link exception, it reads the link exception registers, increments its count for any link exception it finds flagged there, and then clears the link exception registers in preparation for the next link exception.
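The read, count, and clear sequence described above can be sketched as follows. This is only an illustrative model, not the SPAD implementation: the bit positions, the type names (`type_a`, `type_b`, `type_c`), and the `LinkExceptionRegister` class are hypothetical placeholders, since the actual SAIL ASIC register layout is not given here.

```python
from collections import Counter

# Hypothetical bit layout; the real SAIL ASIC bit assignments are not documented here.
EXCEPTION_BITS = {0: "type_a", 1: "type_b", 2: "type_c"}


class LinkExceptionRegister:
    """Stand-in for one hardware link exception register (X or Y port)."""

    def __init__(self):
        self.value = 0

    def flag(self, bit):
        # Hardware side: latch the cause of an exception in its bit.
        self.value |= 1 << bit

    def read(self):
        return self.value

    def clear(self):
        self.value = 0


def handle_link_exception_interrupt(registers, counts):
    """Read each port's register, count every flagged bit, then clear the register."""
    for port, reg in registers.items():
        value = reg.read()
        for bit, name in EXCEPTION_BITS.items():
            if value & (1 << bit):
                counts[port][name] += 1
        reg.clear()  # ready for the next link exception


registers = {"X": LinkExceptionRegister(), "Y": LinkExceptionRegister()}
counts = {"X": Counter(), "Y": Counter()}

# Simulate two exception causes latched on the X port before the interrupt is serviced.
registers["X"].flag(0)
registers["X"].flag(2)
handle_link_exception_interrupt(registers, counts)
```

In this sketch, a bit that is set more than once before the handler runs is counted only once, because a single register bit cannot record repetition.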

The hardware domain of a link exception is limited to the ServerNet SAN ports at each end of the link and the physical link (cable) itself. This means that when a SPA records a link exception, the associated hardware problem lies in one of three places: the SPA, the ServerNet SAN cable connecting the SPA to the node at the other end of the link, or the ServerNet SAN ports on the node at the other end of the link. The node at the other end of the link may be a ServerNet SAN switch (router node) or, in the case of a two-node cluster, another end node (computer) equipped with a SPA. The accurate recording and counting of all link exceptions detected by the hardware also depends on proper operation of the SPAD and the firmware in the ServerNet SAN switch.

The SPAD resets link exception statistics to zero for its SPA on every boot of the node containing the SPA. In addition, the Link Exceptions view can be used by those with root permission to reset the entire set of link exception statistics for a given port (X or Y) on a SPA or for a given fabric (X or Y) in a cluster.
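The two reset scopes described above (one port on one SPA, or one fabric across the cluster) can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the `stats` layout, the function names, and the type names are hypothetical, and it assumes that resetting a fabric means clearing that port's statistics on every SPA in the cluster.

```python
from collections import Counter


def make_stats(spa_numbers):
    """Hypothetical model: stats[spa][port] holds the exception counts for that port."""
    return {spa: {"X": Counter(), "Y": Counter()} for spa in spa_numbers}


def reset_spa_port(stats, spa, port):
    """Reset all link exception statistics for one port (X or Y) on one SPA."""
    stats[spa][port].clear()


def reset_fabric(stats, fabric):
    """Reset all link exception statistics for one fabric (X or Y) on every SPA."""
    for ports in stats.values():
        ports[fabric].clear()


stats = make_stats([0, 1])
stats[0]["X"]["type_a"] = 5
stats[1]["X"]["type_a"] = 2
stats[1]["Y"]["type_b"] = 7

reset_spa_port(stats, 0, "X")  # clears only SPA 0, port X
reset_fabric(stats, "Y")       # clears port Y on every SPA
```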

NOTE: There are several types of link exceptions, as explained under Link Exception Types. If the Second link exception count is nonzero, the exception counts for the associated X or Y port are approximate. See Second (link exception type) for details.


Link Exception Types

The following link exception types are tracked for the X and Y ports, on both a SPA and cluster basis, in the Link Exceptions view:

Viewing Statistics

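The link exception statistics appear in two panes: the For SPA N pane, which shows the statistics for a single SPA, and the bottom pane, which shows the totals for the cluster on a fabric basis. The For SPA N pane has a SPA choice box used to select the SPA whose statistics are to be retrieved.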

To view the link exception statistics for a particular SPA, select the SPA number with the SPA choice box; the statistics for the selected SPA are displayed. The total link exception statistics for the cluster on a fabric basis are continuously displayed in the bottom pane.
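The relationship between the per-SPA statistics and the per-fabric cluster totals can be sketched as a simple aggregation. This is an assumption made for illustration: it treats each cluster total as the sum of the matching per-SPA counts for that fabric, and the `stats` layout, the `cluster_totals` function, and the type names are hypothetical.

```python
from collections import Counter


def cluster_totals(stats):
    """Sum per-SPA exception counts into one total per fabric (X and Y)."""
    totals = {"X": Counter(), "Y": Counter()}
    for ports in stats.values():
        for fabric in ("X", "Y"):
            totals[fabric].update(ports[fabric])  # adds counts key by key
    return totals


# Hypothetical per-SPA statistics, keyed by SPA number and then by port.
stats = {
    0: {"X": Counter(type_a=2), "Y": Counter()},
    1: {"X": Counter(type_a=1, type_b=4), "Y": Counter(type_b=3)},
}
totals = cluster_totals(stats)
```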


Resetting Statistics

Each of the two panes has a Reset X button and a Reset Y button, available to users with root permission.