Disabled Paths Alert Help


Overview

The Disabled Paths Alert dialog box appears when any path in the cluster transitions from the enabled state to the disabled state. It appears regardless of which NCMS application is currently displayed.

It is important to repair the failing path(s) and return the ServerNet SAN to fully fault-tolerant operation so that a subsequent path failure does not jeopardize access to cluster resources. Refer to Repairing the Disabled Path for troubleshooting guidelines.


Responding to the Dialog

The dialog box contains an OK button and a Help button.

Repairing the Disabled Path

A disabled (failed) ServerNet SAN path indicates a hardware problem. The failure may be caused by cabling, the ServerNet SAN switch, or the SPAs (in the nodes) providing the path. The following diagram shows a case where the link between node 2 and the X ServerNet SAN switch has sustained a failure. Each link in the diagram is bidirectional, with a transmitter and receiver at each end of each link.

Use the ServerNet States View to help isolate the failure. Then force ServerNet SAN communications to the failure-free fabric (the one with all paths enabled) so that the hardware in the disabled path(s) can be evaluated without interrupting operation of the cluster.

In this example, the ServerNet States View shows that the X paths between nodes 1 and 2 and between nodes 2 and 3 are disabled. The X path between nodes 1 and 3 is enabled. Because node 2 is the only node common to both disabled paths, the hardware failure may be a loose or broken cable between the X switch and node 2, a problem with the SPA in node 2, or a fault in the X switch port that connects to node 2. By forcing all communications to the Y fabric (network), the suspect cable and the X switch can be evaluated for problems while the cluster continues to run. Node 2 must be removed from the cluster to test its SPA with the ServerNet SAN offline diagnostics.

If all paths in one fabric become disabled, the associated switch is a likely suspect (its power may have been turned off). The key to identifying the suspect hardware is to first identify all disabled paths in your cluster so you can determine what hardware must still be operational.
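The reasoning above can be sketched as a short Python routine. This is only an illustration of the isolation logic, not part of NCMS: the function name `suspect_hardware`, its arguments, and its return values are hypothetical, and the path states are assumed to have been read by hand from the ServerNet States View for a single fabric (X or Y).

```python
from itertools import combinations


def suspect_hardware(nodes, disabled_paths):
    """Suggest suspect hardware on ONE fabric from its disabled paths.

    Hypothetical helper, not an NCMS interface.
    nodes: node numbers in the cluster, e.g. [1, 2, 3].
    disabled_paths: (a, b) node pairs whose path on this fabric is disabled.
    Returns (kind, node_list), where kind is one of
    "none", "switch", "node_link", or "unknown".
    """
    nodes = sorted(set(nodes))
    # Every node pair has one path on a given fabric.
    all_paths = {frozenset(pair) for pair in combinations(nodes, 2)}
    disabled = {frozenset(pair) for pair in disabled_paths}

    if not disabled:
        return ("none", [])
    if disabled == all_paths:
        # Every path on the fabric is down: the switch is the likely suspect.
        return ("switch", [])

    # Nodes that take part in every disabled path.
    common = set.intersection(*(set(path) for path in disabled))
    # Keep a node only if ALL of its paths on this fabric are disabled,
    # which points at its cable, its SPA, or its switch port.
    isolated = [
        n for n in sorted(common)
        if all(frozenset((n, m)) in disabled for m in nodes if m != n)
    ]
    if isolated:
        return ("node_link", isolated)
    return ("unknown", [])


# The example from the text: X paths 1-2 and 3-2 disabled, 1-3 enabled.
print(suspect_hardware([1, 2, 3], [(1, 2), (3, 2)]))  # ('node_link', [2])
```

A result of "node_link" narrows the search to the cable, SPA, or switch port for the listed node; it does not say which of the three has failed. In a two-node cluster the single path is also the whole fabric, so this sketch reports "switch" and the node-side hardware still has to be checked.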

In addition to the ServerNet States View, use the Link Exceptions view to examine the link exception counts for the suspect SPA(s). See Link Exceptions Help for an explanation of the link exception types and what they mean.

The SPA driver (SPAD) messages written to the system console and system log for the suspect node are also useful for isolating the failure that led to the disabled path(s). The NonStop Clusters for SCO UnixWare System Administrator's Guide contains instructions for running the offline diagnostics and troubleshooting information for each of the ServerNet SAN messages written by the SPAD.