Patch Name: PHKL_12781

Patch Description: s700_800 10.30 HP-PB Fast Wide SCSI cumulative patch

Creation Date: 97/10/13

Post Date: 97/10/15

Hardware Platforms - OS Releases:
    s700: 10.30
    s800: 10.30

Products: N/A

Filesets:
    OS-Core.CORE-KRN

Automatic Reboot?: Yes

Status: General Superseded

Critical:
    Yes
    PHKL_12781: PANIC
    PHKL_11934: PANIC HANG CORRUPTION

Path Name: /hp-ux_patches/s700_800/10.X/PHKL_12781

Symptoms:
    PHKL_12781:
    DSDe439707
    System panics during heavily loaded SCSI bus conditions.
    Specifically during slow device/low priority SCSI device
    recovering from time-out conditions.

    PHKL_11934:
    (4701355974/DSDe436855)
    A customer may see error 31 and error 76 in the diagnostics
    logs for scsi1.
    For scsi3: odd address transfers may cause a panic. Odd length
    transfers may cause corruption in the byte after the end of the
    buffer passed in. Also if a transfer is > 64K and is an odd
    length transfer then it may cause the system to eventually
    panic. This is no longer allowed.

    (4701359083/DSDe433815)
    An ioscan to an STK9710 tape library unit on an HP-PB F/W/D
    (scsi3) scsi bus will hang.

    (1653195172/DSDe432904)
    During initializations of new DLT tapes using the Omniback
    backup utility, DLT tape drives will hang and the scsi bus
    will be reset (to recover from the hang).

    (5003351015/DSDe433081)
    I/O's that start on odd-aligned addresses will fail.

    (4701351619/DSDe434446)
    Under heavy load some scsi commands will fail.

    (4701351361/DSDe432493)
    The system will hang and scsi bus resets with error code 103
    and/or 104 will occur. This problem is readily observed on DLT
    tape drives under a light load and a variety of disc drives
    under heavy load.

    (4701351775/DSDe435545)
    Accesses to the magneto-optical auto changer will hang.

Defect Description:
    PHKL_12781:
    DSDe439707
    The Panic/data page fault was localized to a SCSI message
    queue bug in the s3_send_reply/s3_send_abort_reply call
    sequence during highly utilized bus conditions with both
    high/low priority devices contending for I/O requests
    simultaneously.

    PHKL_11934:
    (4701355974/DSDe436855)
    scsi3: odd length transfers cause corruption, odd address
    transfers may cause a panic, and odd length transfers > 64 K
    may cause a panic as well. To reproduce, I wrote a test
    program that loops through even and odd transfers starting at
    1 byte and ending at some number of K. Disks should not be
    affected as data must be transferred in block sizes and so
    must be even.

    (4701359083/DSDe433815)
    There were three separate problems that appear to be unique to
    the STK9710 tape library unit on the HP-PB F/W/D scsi bus:
    1) During an ioscan of the bus w/ STK9710 auto changer an
       ioscan to a non-existent device would time-out as opposed
       to failing with a selection time-out. The driver would
       initiate an auto abort of the request which would hang in
       the driver after it failed w/ selection time-out.
       (Fix: Treat it like a probe w/ selection time-out.)
    2) If an abort failed w/ an error, the bus was reset w/ error
       code 75. (Fix: Retry)
    3) The STK9710 auto changer is an asynchronous/narrow device.
       It rejects all of the extended scsi messages used to
       negotiate synchronous/wide. The driver does not have
       adequate protections for negotiations in process. The
       driver state machine gets lost during the lengthy
       negotiations.
       (Fix: After a device has negotiated asynch/narrow, we
       don't allow host initiated negotiations until after a
       bus-reset, bus device reset or powerfail.)

    (1653195172/DSDe432904)
    The problem was corrected by increasing the timer value that
    controls sending commands to scsi3.

    (5003351015/DSDe433081)
    All I/O's starting on odd-aligned addresses will fail (the
    scsi3 driver fails the request). The problem was corrected by
    copying the data from the odd aligned buffer to an even
    aligned buffer.

    (4701351619/DSDe434446)
    Under heavy load some commands like inquiry, request sense,
    and test unit ready may fail. The commands are allocated on
    the stack. If the stack space is freed, the command will get
    corrupted, resulting in an invalid command check condition
    from the device.

    (4701351361/DSDe432493)
    The system will hang and SCSI Bus Reset 103's and 104's will
    be observed. Sometimes, timer messages would not get sent to
    the requestor if the timer "popped" before the SIO timer
    services completely set up the timer. If a device returns more
    data than requested to scsi3, generating a "data over-run"
    condition, scsi3 will hang. Data over-runs are caused by
    "bugs" in the device firmware. A bus reset 125 was added to
    clean up the resulting bus hang.

    (4701351775/DSDe435545)
    Attempts to access the magneto-optical disc changer will hang
    because the scsi3 driver does not wait long enough for the
    changer to drop the scsi bus REQ line.

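The reproduction described for 4701355974/DSDe436855 (a loop over even and odd transfer lengths starting at 1 byte) can be approximated from the shell. The following is a minimal sketch under stated assumptions, not the original test program: the function name sweep_transfers, its arguments, and the use of /dev/zero as a data source are all hypothetical, and /dev/zero may not exist on every release. dd controls only the transfer length, so this exercises odd lengths but not odd-aligned buffer addresses.

```shell
#!/bin/sh
# Hypothetical sketch, not the original test program.
# sweep_transfers TARGET MAX_BYTES
#   Issues one write of every length from 1 to MAX_BYTES bytes, so odd and
#   even lengths alternate. TARGET is an assumption: use a scratch file or a
#   scratch device, never one that holds data you need.
sweep_transfers() {
    target=$1
    max_bytes=$2
    len=1
    failures=0
    while [ "$len" -le "$max_bytes" ]; do
        # bs=$len count=1 asks dd for a single write of exactly $len bytes.
        if ! dd if=/dev/zero of="$target" bs="$len" count=1 2>/dev/null; then
            echo "transfer of $len bytes failed"
            failures=$((failures + 1))
        fi
        len=$((len + 1))
    done
    echo "failures=$failures"
}
```

Calling sweep_transfers with a scratch target and an upper bound above 64 K (for example 70000) would cover the "> 64 K" odd-length case mentioned above; the bound is a choice, not a value taken from the defect report.
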
SR:
    1653195172 4701351361 4701351619 4701351775 4701355974
    4701359083 4701370577 5003351015

Patch Files:
    /usr/conf/lib/libhp-ux.a(scsi3.o)
    /usr/conf/lib/libhp-ux.a(sio_drivers3.o)

what(1) Output:
    /usr/conf/lib/libhp-ux.a(scsi3.o):
        scsi3.c $Date: 97/10/09 14:35:46 $ $Revision: 1.11.102.16 $ PATCH_10.30 (PHKL_12781)
    /usr/conf/lib/libhp-ux.a(sio_drivers3.o):
        sio_drivers3.o $Date: 97/07/29 13:23:19 $ $Revision : 1.3.102.3 $ PATCH_10.30 (PHKL_11934)

cksum(1) Output:
    2527538535 83676 /usr/conf/lib/libhp-ux.a(scsi3.o)
    2869730517 138284 /usr/conf/lib/libhp-ux.a(sio_drivers3.o)

Patch Conflicts: None

Patch Dependencies: None

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_11934

Equivalent Patches:
    PHKL_12778: s800: 10.01
    PHKL_12779: s800: 10.10
    PHKL_12780: s800: 10.20

Patch Package Size: 280 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Log in as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_12781

    5a. For a standalone system, run swinstall to install the
        patch:

        swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHKL_12781.depot

    5b. For a homogeneous NFS Diskless cluster run swcluster on the
        server to install the patch on the server and the clients:

        swcluster -i -b

        This will invoke swcluster in the interactive mode and
        force all clients to be shut down.

        WARNING: All cluster clients must be shut down prior to the
                 patch installation. Installing the patch while the
                 clients are booted is unsupported and can lead to
                 serious problems.

        The swcluster command will invoke an swinstall session in
        which you must specify:

            alternate root path - default is /export/shared_root/OS_700
            source depot path   - /tmp/PHKL_12781.depot

        To complete the installation, select the patch by choosing
        "Actions -> Match What Target Has" and then
        "Actions -> Install" from the Menubar.

    5c. For a heterogeneous NFS Diskless cluster:

        - run swinstall on the server as in step 5a to install the
          patch on the cluster server.

        - run swcluster on the server as in step 5b to install the
          patch on the cluster clients.

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHKL_12781. If you do not wish to retain a
    copy of the original software, you can create an empty file
    named /var/adm/sw/patch/PATCH_NOSAVE.

    Warning: If this file exists when a patch is installed, the
             patch cannot be deinstalled. Please be careful when
             using this feature.

    It is recommended that you move the PHKL_12781.text file to
    /var/adm/sw/patch for future reference.

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_12781.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions: None
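
The cksum(1) values listed for the patch files can be compared against a file on disk with a small helper. This is a sketch only: verify_cksum is a hypothetical function, not something shipped with the patch, and it assumes the listed checksums were computed on the individual object files, which would first have to be extracted from the archive into a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch.
# verify_cksum FILE WANT_SUM WANT_SIZE
#   Prints OK when cksum(1) reports the expected checksum and byte count for
#   FILE, otherwise prints MISMATCH and returns 1.
verify_cksum() {
    # Reading from stdin keeps the file name out of the cksum output.
    actual=$(cksum < "$1" | awk '{ print $1, $2 }')
    if [ "$actual" = "$2 $3" ]; then
        echo "OK"
    else
        echo "MISMATCH: expected $2 $3, got $actual"
        return 1
    fi
}
```

For example, after extracting scsi3.o in a scratch directory (assuming ar x /usr/conf/lib/libhp-ux.a scsi3.o), the call would be verify_cksum scsi3.o 2527538535 83676, using the checksum and size given in the cksum(1) Output field.
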