Patch Name:  PHNE_9550

Patch Description:  s700_800 10.10 SNAplus Link R4.3 cumulative patch

Creation Date:  97/04/11

Post Date:  97/04/23

Hardware Platforms - OS Releases:
    s700: 10.10
    s800: 10.10

Products:
    SNAP-LINK R4.3

Filesets:
    SNAP-LINK.SNAP-LINK

Automatic Reboot?:  Yes

Status:  General Superseded

Critical:
    Yes
    PHNE_9550: HANG
    PHNE_7360: PANIC

Path Name:  /hp-ux_patches/s700_800/10.X/PHNE_9550

Symptoms:
    PHNE_9550:
    (1) 1653112474
        SNAplus connections over LAN links fail to detect link
        outages.

    (2) 1653153957
        Only one session binds successfully when multiple sessions
        try to use the APPC default LU pool.

    (3) 1653179788
        A kernel panic with the following stack trace uniquely
        identifies this problem:

        panic+0x10
        report_trap_or_int_and_panic+0x8c
        trap+0xbf0
        $RDB_trap_patch+0x20
        s1pupbnd+0xd4
        s1pucsc+0x2f4
        s1pusvc+0x3e4
        s1pgdisp+0x20c

    (4) 1653180125
        panic: (display==0xbf00, flags==0x0) Data segmentation fault

        Stack as follows starting sp=0x7ffe6fc0
        panic+0x1c ( arguments not stored )
            pc=0x2625c0, pfmp=0x7ffe6f60, psp=0x7ffe6f80
        trap+0xaac ( arguments not stored )
            pc=0x1c15f0, pfmp=0x7ffe6ea0, psp=0x7ffe6ec0
        trap marker save state 0x7ffe6c90 sp 0x7ffe6ec0 framesize 0x230
        s1pxabnd+0x1b1 ( 0x7ffe1712 ,0x003d0002 ,0x00000000 ,0x7ffe0149 )
            pc=0x147ffc, pfmp=0x7ffe6bf0, psp=0x7ffe6c10
        s1pxsnd+0x8c9 ( arguments not stored )
            pc=0x144f64, pfmp=0x7ffe6b70, psp=0x7ffe6b90
        s1pgdisp+0x385 ( 0x004d0028 ,0x7ffe6afc ,0x7ffe0001 ,0x660000fd )
            pc=0xe3e94, pfmp=0x7ffe6b30, psp=0x7ffe6b50
        sna_1_sbpsched+0x10d ( arguments not stored )
            pc=0x158ca8, pfmp=0x7ffe6ab0, psp=0x7ffe6ad0
        sna_1_sbpikusv+0x55 ( 0x01263074 ,0x00000000 ,0x00000098 ,0x00004321 )
            pc=0x152518, pfmp=0x7ffe6a70, psp=0x7ffe6a90
        sq_wrapper+0x5c ( arguments not stored )
            pc=0x93188, pfmp=0x7ffe6a30, psp=0x7ffe6a50
        csq_lateral+0x80 ( arguments not stored )
            pc=0x96174, pfmp=0x7ffe69b0, psp=0x7ffe69d0
        runq_run+0x58 ( arguments not stored )
            pc=0x930d4, pfmp=0x7ffe6970, psp=0x7ffe6990
        str_sched_daemon+0x1b0 ( arguments not stored )
            pc=0x934e0, pfmp=0x7ffe68b0, psp=0x7ffe68d0
        main+0xa04 ( arguments not stored )
            pc=0x24c814, pfmp=0x7ffe67f0, psp=0x7ffe6810
        $vstart+0x3d ( arguments not stored )
            pc=0x1b1fa4, pfmp=0x7ffe67c0, psp=0x7ffe67e0
        istackatbase+0x88 ( arguments not stored )
            pc=0x1c5f0, pfmp=0xffffffe0, psp=0x0

    (5) 1653183947
        The problem occurs because the customer has an unusual
        configuration and then gets a certain error sequence.
        Specifically, the customer has a dependent LU6.2 local LU
        with a single remote LU and 2 associated modes. One of these
        modes (APPC4K) has an initially active session configured
        (session limits 1,1,0,1) and the other mode (APPC1K) has an
        on-demand session (limits 1,0,0,0). The problem only occurs
        with the modes in this order (alphabetically sorted by mode
        name).

        We first had a time-out trying to activate the SDLC
        connection (so the initially active mode is marked as
        needing retry). We then get an Allocate for the on-demand
        mode before the connection is retried. When we get ACTLU for
        the LU, we try to activate both modes (internally queuing an
        INIT-SELF for APPC1K) and get into an infinite loop (the
        kernel trace buffer is full of SNA0026 logs for APPC4K).

    (6) 1653185306
        An N_PVC_DETACH indication was treated like an N_PVC_DETACH
        confirmation, so the outage was not reported to the Node.

    (7) 1653185348
        Following an X25 Virtual Circuit Reset, the customer has to
        restart the SNAplus node/link to reestablish a connection
        with the Host.

    (8) 1653185454
        Everything works fine for the first connection, but when the
        customer deactivates the connection from the Host, it cannot
        be reactivated unless the Link is stopped and started.

    (9) 1653191924
        On R4.2, the customer configured an incoming peer SDLC
        connection. DTR was raised, but instead of frames being
        received, RX overrun and Lost interrupt events occurred.
        Short frames could be received and frames could be
        transmitted.

    (10) 4701325399
        APPC TPs will not work when one of the machines is migrated
        from the R4 line of releases to the R5 line.

    (11) 4701335000
        The customer may get a variety of communication path errors
        on any of the services, but this is most likely to hit those
        who are running a large number of TPs.

    (12) 4701343178
        The problem is actually the same as that fixed in
        SR 1653183947, even though the stack looks different. The
        problem occurs because the customer has an unusual
        configuration and then gets a certain error sequence.
        Specifically, the customer has a dependent LU6.2 local LU
        with a single remote LU and multiple associated modes which
        are configured as initially active.

    (13) 5003320341
        A SNAplus R4.1 Ethernet (802.3) connection fails to recover
        if the host link is inactive for more than 1 1/2 hours.
        After the 1 1/2 hours, the R4.1 system reports the following
        error every 10 seconds in sna.aud:

        LAN T10SNA9706: Exceeded maximum connections allowed for
        802.3 link LAN on node NODE1

        After this error is recorded, the only way to recover the
        connection is to stop and start the link used by that
        connection.

    (14) 5003287276
        3270 sessions are not completely logged off when the user
        exits 3270 using Ctrl-C or from the File pull-down menu.

    PHNE_7360:
    (1) 1653155317
        Intermittent system panic caused by PU2.1.

    (2) 4701319681
        Two QLLC link stations have been started. Deactivating one
        of the link stations sometimes caused the system to panic
        with a data page fault. This problem only occurs when
        running QLLC over streams (Spider) X.25.

    (3) 4701317925
        SDLC link causes system panic on Kittyhawk.
        Error message:

        psi0: Could not map (while) quad structure
        panic+0x10
        nio_build_dma_quads + 0x344
        snap_nio_write + 0x224
        svphtx + 0x1a8
        slphtfrm + 0x4f0

Defect Description:
    PHNE_9550:
    (1) 1653112474
        This enhancement request has now been included in the main
        code for SNAplus, so it will be available in all future
        SNAP-LINK patches as standard.
        Because it uses the LAN inactivity timer, which is a
        function of the DLPI interface, it requires the current DLPI
        patch to be installed, up until SNAplus R4.4 (HP-UX 10.20).

    (2) 1653153957
        SNAplus always attempts to use the same LU even if there are
        other available LUs in the default LU pool.

    (3) 1653179788
        The problem occurred because we fail to associate a new
        control block for an incoming dependent LU BIND with the
        associated SSCP control block. There is a window in the code
        where we can re-use a control block before it has been
        freed, after a previous session using the same OAF/DAF has
        been brought down safely. The fix ensures that under these
        circumstances we cleanly terminate all references to the old
        (now dead) session and set things up correctly for
        processing the new session activation.

    (4) 1653180125
        The problem is caused by an unusual LU 6.2 SNA sequence
        talking to the mainframe:

        RX BIND (dependent, CL)
        TX BIND +ve
        RX FMH5 RQD2 (S.N. 1), TP starts
        Kill TP, TX -ve RSP, TX FMH7 CEB RQD1
        RX FMH7 CEB RQD1 (S.N. 2), TX +ve (S.N. 8000)
        RX FMH5 BB RQD2 (S.N. 3), TP starts
        RX +ve (S.N. 8003)

        At this point we detect a BETB condition, because the send
        chain FSM is pending a RSP with CEB in it, and then decouple
        the SCB and RCB. This leads to a later crash when we get
        lost locality from the TP being killed.

        The SNA is unusual because there are 2 FMH7 CEBs (Deallocate
        Abend). The Host's one looks to be superfluous but is
        allowed through by the APPC protocols. The Host also delays
        responding to the CEB we send until part way through the
        next bracket and uses the current bracket sequence number
        (so the response does not appear stray).

    (5) 1653183947
        The fix is to prevent any looping for ACTLU processing,
        which can only have 1 mode that can be processed. The
        customer could also prevent the problem by altering the
        configuration to make the APPC4K mode not initially active
        (change the session limits to 1,1,0,0) or by removing one of
        the modes.
        This would also prevent some of the error logs that the
        customer will get with this configuration as the two modes
        compete for the LU.

    (6) 1653185306
        The processing of a Disconnect Indication generates two
        calls to the Close Connection routine.

    (7) 1653185348
        The processing of a Disconnect Indication generates two
        calls to the Close Connection routine.

    (8) 1653185454
        The Connection Control Block structure is released before
        the reception of the X25 Disconnect Confirm, preventing the
        glue from completing the Node's CLOSE(LINK).

    (9) 1653191924
        The HMOD was being primed with a frame size of 5; the
        configured frame size in the connection record was not
        reflected in the link record. The link record was altered to
        point to the link data of the first connection in all cases
        (this was already done for Host links). The fix was also
        sent to HMOD.

    (10) 4701325399
        The cause of the problem is the lack of a fully qualified LU
        name on the R4 side. The R5 side behaves correctly and sends
        the fully qualified LU name, but it then does not match the
        table entry on the R4 side. This fix makes the R4 Node smart
        enough to match the LU names.

    (11) 4701335000
        When the customer has used up all of the Service Table
        entries, the node (PU2.1) is unable to handle any more
        requests from the services, causing various types of errors.

    (12) 4701343178
        The problem can be resolved in either of two ways:
        a) the customer can reconfigure the system to change the
           mode records to not have initially active sessions
           (alter from 1,1,0,1 to 1,1,0,0), or
        b) install the PU 2.1 node fix (version 207 of libsix1.a in
           R4.2).

    (13) 5003320341
        No timer was implemented on the Ethernet connection.

    (14) 5003287276
        The reason sessions aren't completely logged off is that
        3270 exits with a TERMSLF instead of an UNBIND. Since IBM
        implementations currently send UNBINDs in this type of
        situation, it is reasonable to change our product to do the
        same. In fact, our SNAplus2 3270 product already behaves
        this way.

    PHNE_7360:
    (1) 1653155317
        Problem seems to occur as a result of killing a TP running
        on a dependent LU6.2 session.

    (2) 4701319681
        This problem is due to an invalid data pointer when multiple
        link stations are started.

    (3) 4701317925
        The problem is caused by the SDLC PSI driver failing to
        unmap memory on the Kittyhawk system. The panic was intended
        by the driver when it can't get a new map. The unmap bug was
        corrected and the driver will report an error if there is no
        map.

SR:
    5003320341 5003287276 4701343178 4701335000 4701325399
    4701319681 4701317925 1653191924 1653185454 1653185348
    1653185306 1653183947 1653180125 1653179788 1653155317
    1653153957 1653112474

Patch Files:
    /usr/conf/lib/libpsi0.a
    /usr/conf/lib/libsix1.a
    /usr/conf/lib/libsixet.a
    /usr/conf/lib/libsixfd.a
    /usr/conf/lib/libsixqs.a
    /usr/conf/lib/libsixtk.a

what(1) Output:
    /usr/conf/lib/libsixqs.a:
        A.10.10.202 SNAplus R4.3 Streams QLLC (PHNE_9550 : 96/12/06 09:19:36)
    /usr/conf/lib/libsix1.a:
        A.10.10.206 SNAplus R4.3 PU 2.1 (PHNE_9550 : 96/12/05 17:33:17)
    /usr/conf/lib/libsixet.a:
        A.10.10.204 SNAplus R4.3 802.3 (PHNE_9550 : 96/12/06 17:16:18)
    /usr/conf/lib/libsixfd.a:
        A.10.10.203 SNAplus R4.3 FDDI (PHNE_9550 : 96/12/06 17:17:21)
    /usr/conf/lib/libsixtk.a:
        A.10.10.203 SNAplus R4.3 Token Ring (PHNE_9550 : 96/12/06 17:15:08)
    /usr/conf/lib/libpsi0.a:
        A.10.10.202 SNAplus R4.3 PSI Driver (PHNE_9550: 96/12/18 17:39:14)

cksum(1) Output:
    3300171726 181796 /usr/conf/lib/libsixqs.a
    769702428 906248 /usr/conf/lib/libsix1.a
    3979107712 187976 /usr/conf/lib/libsixet.a
    3221261413 187228 /usr/conf/lib/libsixfd.a
    1372567780 186912 /usr/conf/lib/libsixtk.a
    2433785159 46936 /usr/conf/lib/libpsi0.a

Patch Conflicts:  None

Patch Dependencies:
    s700: 10.10: PHNE_9551
    s800: 10.10: PHNE_9551

Hardware Dependencies:  None

Other Dependencies:  None

Supersedes:
    PHNE_7360

Equivalent Patches:  None

Patch Package Size:  1720 Kbytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your
    Hewlett-Packard support terms and conditions for precautions,
    scope of license, restrictions, and limitation of liability
    and warranties, before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Log in as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

            cd /tmp
            sh PHNE_9550

    5a. For a standalone system, run swinstall to install the
        patch:

            swinstall -x autoreboot=true -x match_target=true \
                -s /tmp/PHNE_9550.depot

    5b. For a homogeneous NFS Diskless cluster, run swcluster on
        the server to install the patch on the server and the
        clients:

            swcluster -i -b

        This will invoke swcluster in the interactive mode and
        force all clients to be shut down.

        WARNING: All cluster clients must be shut down prior to
                 the patch installation. Installing the patch
                 while the clients are booted is unsupported and
                 can lead to serious problems.

        The swcluster command will invoke an swinstall session in
        which you must specify:

            alternate root path - default is /export/shared_root/OS_700
            source depot path   - /tmp/PHNE_9550.depot

        To complete the installation, select the patch by choosing
        "Actions -> Match What Target Has" and then
        "Actions -> Install" from the Menubar.

    5c. For a heterogeneous NFS Diskless cluster:

        - run swinstall on the server as in step 5a to install the
          patch on the cluster server.

        - run swcluster on the server as in step 5b to install the
          patch on the cluster clients.

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHNE_9550. If you do not wish to retain a
    copy of the original software, you can create an empty file
    named /var/adm/sw/patch/PATCH_NOSAVE.

    Warning: If this file exists when a patch is installed, the
             patch cannot be deinstalled. Please be careful when
             using this feature.

    It is recommended that you move the PHNE_9550.text file to
    /var/adm/sw/patch for future reference.

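    Because the presence of /var/adm/sw/patch/PATCH_NOSAVE makes an
    installed patch impossible to deinstall, it can be worth checking
    for that file before running swinstall. The following is a minimal
    sketch in POSIX shell. The helper name patch_save_mode and its
    optional directory argument are assumptions made for illustration;
    they are not part of the patch or of swinstall.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the patch): report whether the
# original software would be archived when a patch is installed.
# The optional first argument overrides the save directory so the check
# can be tried against a scratch directory; the default path is the one
# named in the installation instructions.
patch_save_mode() {
    save_dir=${1:-/var/adm/sw/patch}
    if [ -f "$save_dir/PATCH_NOSAVE" ]; then
        # Marker file present: originals are not kept, so the patch
        # could not be deinstalled later.
        echo "nosave"
    else
        # No marker file: originals are archived under the save directory.
        echo "save"
    fi
}
```

    One way to use it is to stop before step 5a unless the result is
    "save", for example:
    [ "$(patch_save_mode)" = save ] || echo "PATCH_NOSAVE is set"
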
    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

            dd if=/tmp/PHNE_9550.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    Stop the SNA daemon before installing the patch:

            snapstop daemon

    After installing the patch, start the SNA daemon:

            snapstart daemon
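
    Once the patch is installed, the values listed under cksum(1)
    Output can be compared with the libraries on disk. The following
    is a minimal sketch in POSIX shell, assuming the six checksum
    lines have been copied unchanged into a manifest file. The helper
    name check_manifest and the manifest file name used below are
    hypothetical and are not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the patch): compare files on
# disk with a manifest whose lines have the same three columns that
# cksum(1) prints: <checksum> <size> <path>.
# Prints one status line per file and returns non-zero if any file is
# missing or differs from the manifest.
check_manifest() {
    manifest=$1
    status=0
    while read -r want_sum want_size path; do
        # Skip blank or incomplete manifest lines.
        [ -n "$path" ] || continue
        if [ ! -f "$path" ]; then
            echo "MISSING  $path"
            status=1
            continue
        fi
        # Reading from stdin makes cksum print only checksum and size.
        set -- $(cksum < "$path")
        if [ "$1" = "$want_sum" ] && [ "$2" = "$want_size" ]; then
            echo "OK       $path"
        else
            echo "MISMATCH $path"
            status=1
        fi
    done < "$manifest"
    return $status
}
```

    With the manifest saved as, say, /tmp/PHNE_9550.cksum, running
    "check_manifest /tmp/PHNE_9550.cksum" prints one line per library
    and returns a non-zero status if any file does not match.
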