Patch Name: PHKL_19130

Patch Description: s700 10.20 WSIO SCSI cumulative patch

Creation Date: 99/07/22

Post Date: 99/07/23

Warning: 99/08/27 - This Non-Critical Warning has been issued by HP.

    - PHKL_19130 introduces a problem with the WSIO SCSI disk
      driver with devices in an LVM configuration. When a
      powerfail occurs in the path to a device, for example
      the device or Fibre Channel-SCSI MUX powerfails, the
      LUNs on the device will be inoperative. Accesses to the
      LUNs will result in I/O errors.

    - Once the path to the device is restored, issuing the
      command 'scsictl -a' on each LUN will restore
      availability.

    - HP recommends that PHKL_19130 be removed from all systems
      with LVM configurations on which it has been installed.
      PHKL_19130 should also be removed from all software
      depots.

    - The superseded patch, PHKL_19097, does not exhibit this
      problem. PHKL_19097 will be re-released until a
      replacement patch is available. HP recommends that
      PHKL_19097 be installed after PHKL_19130 is removed.

Hardware Platforms - OS Releases:
    s700: 10.20

Products: N/A

Filesets:
    OS-Core.CORE-KRN
    ProgSupport.C-INC

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical:
    Yes
    PHKL_19130: PANIC
    PHKL_19097: PANIC
    PHKL_18917: HANG
    PHKL_18390: HANG
    PHKL_17467: HANG
    PHKL_16861: HANG
    PHKL_16926: PANIC

Path Name: /hp-ux_patches/s700/10.X/PHKL_19130

Symptoms:
    PHKL_19130:
    DTS#: JAGaa42584
    SR#: 1653281824
    System panic with "scsi unrecovered deferred error".

    DTS#: JAGaa44446
    SR#: 8606101359
    Command-mode might stop working at any arbitrary time with
    respect to the application and device trying to use it.

    DTS#: JAGaa08513
    SR#: 8606101473
    While doing an Inquiry command to request the Unit Serial
    Number Page, one extra byte is transferred. There are no
    real symptoms associated with this problem.

    PHKL_19097:
    System panics (Data Page Fault) in scsi_start_bus_locked().

    PHKL_18917:
    LVM hangs due to I/O requests never being returned by the
    I/O subsystem. The message "Device violation of Contingent
    Allegiance" is issued to syslog.

    PHKL_18390:
    SR:1653300004 DTS: JAGaa47696 (dup of JAGab11155)
    Slow PVlink failover after installing PHKL_17467.
    Diskinfo reports back on an unavailable disk.

    SR:1653300970 DTS:JAGab11365
    SR:1653290395 DTS: JAGaa47016
    A faulty disk can prevent the LVM mirroring from working.

    PHKL_17467:
    I/O failover hang on Fiber Channel PV_link.

    PHKL_16861:
    I/O failover hang on Fiber Channel PV_link.

    PHKL_17639:
    This patch enables new functionality that is part of the
    10.20 ACE (Additional Core Enhancements) Workstation
    bundle, which adds new I/O drivers to support the B1000,
    C3000, and J5000 systems.

    PHKL_16926:
    SR:5003434118 DTS:JAGaa23967
    System panics (Data Page Fault) in scsi_destroy_scb

    SR:5003429654 DTS:JAGaa40369
    System panics in c720_invalid_req_done

    SR:4701407890 DTS:JAGaa23080
    Unexpected Disconnect Messages when using pass through
    driver

Defect Description:
    PHKL_19130:
    DTS#: JAGaa42584
    SR#: 1653281824
    If immediate reporting is enabled and a deferred error
    occurs, the system will panic with "scsi unrecovered
    deferred error".

    Resolution:
    The new deferred error check/handling method is to block
    all I/O requests for the disk, when a deferred error
    occurs, until the device is closed and reopened.

    DTS#: JAGaa44446
    SR#: 8606101359
    scsi_ctl replaces the cdevsw table entries for d_read and
    d_write when the lun is not in command-mode, as a
    performance improvement. The problem is that the cdevsw
    table is a global resource that is not owned by any single
    lun, so command-mode might stop working at any arbitrary
    time.

    Resolution:
    Removed that code.

    DTS#: JAGaa08513
    SR#: 8606101473
    The FC data length exceeds the maximum SCSI transfer
    length by 1 byte while performing an Inquiry for the Unit
    Serial Number Page.

    Resolution:
    Reduce the size of the scsi serial structure by 1.

    PHKL_19097:
    When sd_open() fails in scsi_lun_open, we goto
    recover_lck1, which falls through to recover_lp.
    recover_lp sets lp->ddsw to NULL but fails to set
    lp->scb_q_nonempty to NULL. This causes a data page fault
    panic in scsi_start_bus_locked(). This might occur when an
    open() fails on a busy device.

    Resolution:
    Set lp->scb_q_nonempty to NULL in label recover_lp in
    scsi_lun_open().

    PHKL_18917:
    When the message is issued (typically caused by a bus
    RESET during a contingent allegiance condition (CAC)), the
    corresponding I/O request is lost and never returned to
    the requestor, eventually causing a system hang.

    Resolution:
    When a bus RESET happens during a CAC, the c720 driver now
    ensures that all currently active I/O requests are posted
    as incomplete and scheduled to be retried.

    PHKL_18390:
    SR:1653300004 DTS: JAGaa47696 (dup of JAGaa11155)
    Slow PVlink failover, or diskinfo reporting good disk
    status on an unavailable disk, is due to the SCSI INQUIRY
    command returning cached data instead of sending the
    command down to the device.

    Resolution:
    We now ensure an INQUIRY command will be sent down to the
    device when the disk becomes nonresponsive.

    SR:1653300970 DTS:JAGab11365 and
    SR:1653290395 DTS:JAGaa47016
    If a faulty disk sends a NOT_READY sense key to SCSI, the
    current SCSI policy is to retry the request until the disk
    is ready. This results in hung I/O and prevents the LVM
    mirroring from working.

    Resolution:
    LVM-related NOT_READY requests will be treated as
    nonresponse from the disk and will therefore be failed
    back for LVM to handle.

    PHKL_17467:
    In a hardware configuration, mirrored disks can be
    accessed through primary/alternate Fiber Channel (FC)
    links. If the primary link and the alternate link of one
    disk of a mirrored pair are down, the other disk should
    continue sending or receiving data. The problem is that it
    fails to do so and causes an I/O hang.

    Resolution:
    This patch provides a fix for this hang problem. The SCSI
    layer will retry the FC requests as long as the PFTIMEOUT
    period has not expired and the request is recoverable.
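
    The retry rule in the PHKL_17467 resolution can be
    sketched as a single predicate. This is only an
    illustration under stated assumptions, not the driver
    source: the structure, its field names, the function name
    and the use of seconds as the time unit are all
    hypothetical. Only the rule itself (retry while the
    PFTIMEOUT period has not expired and the request is
    recoverable) comes from the text above.

```c
#include <stdbool.h>

/* Hypothetical request record; the field names are illustrative only. */
struct fc_request {
    long start_time;   /* time the request was first issued (seconds, assumed) */
    bool recoverable;  /* true if the failure may clear on a retry */
};

/*
 * Return true if the request should be retried: the failure must be
 * recoverable and the PFTIMEOUT period (pf_timeout) must not yet have
 * expired. Otherwise the request is failed back to the caller (for
 * example LVM) instead of being retried forever.
 */
static bool should_retry(const struct fc_request *req, long now, long pf_timeout)
{
    return req->recoverable && (now - req->start_time) < pf_timeout;
}
```

    Bounding the retries by a timeout is what lets the
    surviving disk of a mirrored pair carry on: once the
    period expires, the request is returned as failed rather
    than held indefinitely.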

    PHKL_16861:
    In a hardware configuration, mirrored disks can be
    accessed through primary/alternate Fiber Channel (FC)
    links. If the primary link and the alternate link of one
    disk of a mirrored pair are down, the other disk should
    continue sending or receiving data. The problem is that it
    fails to do so and causes an I/O hang. This patch provides
    a temporary fix for this problem. In this fix, the SCSI
    layer will retry the FC request as long as the FC layer
    sets a flag asking for the request to be retried.

    PHKL_17639:
    New functionality to support the B1000, C3000, and J5000
    systems on HP-UX 10.20. The new functionality adds new I/O
    drivers.

    Resolution:
    Add support for new SCSI hardware in the SCSI driver.

    PHKL_16926:
    SR:5003434118 DTS:JAGaa23967
    There is a race condition between scsi_lun_open and
    scsi_start_bus_locked. This can be fixed by incrementing
    the in_use counter before releasing the lun lock, thereby
    ensuring that the lun stays open.

    SR:5003429654 DTS:JAGaa40369
    In c720_invalid_req_done, we directly dereference
    scb->busp without checking that this scb is a bus scb. The
    busp pointer is NULL if the scb is a lun scb. Thus, the
    fix is to add a check to see whether lsp->scb->busp is
    NULL and, if so, obtain the busp from lsp->scb->lp->bus
    instead.

    SR:4701407890 DTS:JAGaa23080
    When using the pass through driver with the "inhibit
    Inquiry on open" option (see scsi_ctl(7)) on a SCSI bus
    with no other devices, and repeatedly opening and closing
    the device to send but a single SCSI command, the bus is
    sometimes in the wrong state when the target device begins
    to transfer data.
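
    The NULL check described under SR:5003429654 above can be
    sketched as follows. This is a minimal illustration, not
    the c720 driver source: the structure layouts and the
    helper name scb_get_bus are hypothetical, and only the
    member names busp, lp and bus are taken from the
    description. The extra guard against a NULL lp is an
    added assumption.

```c
#include <stddef.h>

/* Hypothetical, simplified layouts; only busp, lp and bus are named above. */
struct scsi_bus { int id; };
struct scsi_lun { struct scsi_bus *bus; };
struct scb {
    struct scsi_bus *busp;  /* set for a bus scb, NULL for a lun scb */
    struct scsi_lun *lp;    /* owning lun, used as the fallback path */
};

/*
 * Return the bus for an scb without dereferencing a NULL busp.
 * A bus scb carries the pointer directly; a lun scb reaches the
 * bus through its lun instead.
 */
static struct scsi_bus *scb_get_bus(const struct scb *scb)
{
    if (scb->busp != NULL)
        return scb->busp;
    /* Assumption: lp may also be NULL, so guard it as well. */
    return scb->lp != NULL ? scb->lp->bus : NULL;
}
```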

SR:
    1653281824 1653290395 1653300004 1653300970 1653306654
    4701398263 4701407668 4701407890 4701414136 5003429654
    5003434118 5003464297

Patch Files:
    /usr/conf/lib/libhp-ux.a(scsi_c720.o)
    /usr/conf/lib/libhp-ux.a(scsi_ctl.o)
    /usr/conf/lib/libhp-ux.a(scsi_disk.o)
    /usr/include/sys/scsi_ctl.h

what(1) Output:
    /usr/conf/lib/libhp-ux.a(scsi_c720.o):
        scsi_c720.c $Date: 99/06/18 13:25:35 $ $Revision: 1.5.98.41 $ PATCH_10.20 (PHKL_18917)
        scsi_c720.c $Date: 99/06/18 13:25:35 $ $Revision: 1.5.98.41 $
    /usr/conf/lib/libhp-ux.a(scsi_ctl.o):
        scsi_ctl.c $Date: 99/07/07 17:19:45 $ $Revision: 1.9.98.39 $ PATCH_10.20 (PHKL_19130)
    /usr/conf/lib/libhp-ux.a(scsi_disk.o):
        scsi_disk.c $Date: 99/07/07 17:34:58 $ $Revision: 1.7.98.35 $ PATCH_10.20 (PHKL_19130)
    /usr/include/sys/scsi_ctl.h:
        scsi_ctl.h $Date: 99/07/08 08:03:22 $ $Revision: 1.8.98.12 $ PATCH_10.20 (PHKL_19130)

cksum(1) Output:
    3432856428 97956 /usr/conf/lib/libhp-ux.a(scsi_c720.o)
    925472773 67924 /usr/conf/lib/libhp-ux.a(scsi_ctl.o)
    2545284897 20680 /usr/conf/lib/libhp-ux.a(scsi_disk.o)
    3053296693 52456 /usr/include/sys/scsi_ctl.h

Patch Conflicts: None

Patch Dependencies:
    s700: 10.20: PHKL_16750

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_16861 PHKL_16926 PHKL_17467 PHKL_17639 PHKL_18390
    PHKL_18917 PHKL_19097

Equivalent Patches:
    PHKL_19131: s800: 10.20

Patch Package Size: 300 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support
    terms and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Login as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_19130

    5a. For a standalone system, run swinstall to install the
        patch:

        swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHKL_19130.depot

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHKL_19130. If you do not wish to retain
    a copy of the original software, you can create an empty
    file named /var/adm/sw/patch/PATCH_NOSAVE.

    WARNING: If this file exists when a patch is installed,
             the patch cannot be deinstalled. Please be
             careful when using this feature.

    It is recommended that you move the PHKL_19130.text file
    to /var/adm/sw/patch for future reference.

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_19130.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    This patch depends on base patch PHKL_16750. For
    successful installation, please ensure that PHKL_16750 is
    in the same depot as this patch, or that PHKL_16750 is
    already installed.