Patch Name: PHKL_24294

Patch Description: s700 10.20 smmap, munmap, fcntl/mmap, mmap64, anon mmap

Creation Date: 01/06/04

Post Date: 01/06/12

Warning: 01/07/19 - This Critical Warning has been issued by HP.

    - PHKL_24294 introduced behavior that can cause a system
      panic on PA2.0 architecture systems executing certain
      types of applications. The resulting panic string will
      be similar to 'Returning ID that is already free'.

    - To avoid risk of this behavior, HP recommends PHKL_24294
      be removed from all PA2.0 architecture systems unless
      there is a known requirement for the increased number of
      protection IDs allowed by the patch *and* no system
      panics, as described above, occur under normal workload.
      PHKL_24294 should also be removed from all software
      depots used to install patches on these systems.

    - The previous patch, PHKL_21925, does not exhibit this
      same behavior and is being re-released until a
      replacement patch is available. To ensure that as many
      other known issues as possible are addressed, HP
      recommends that PHKL_21925 be installed after PHKL_24294
      is removed. If PHKL_21925 was installed prior to
      PHKL_24294, it will automatically be restored when
      PHKL_24294 is removed and it will not need to be
      re-installed.

Warning: 02/02/08 - This Critical Warning has been issued by HP.

    - PHKL_24294 introduced behavior on PA2.0 architecture
      systems that can result in user processes that appear to
      hang and consume excess CPU resources, which could lead
      to system performance degradation. While the user
      processes appear to hang, they can be killed and system
      behavior will return to normal.

    - This behavior also exists with superseding patch
      PHKL_25093.

    - This behavior is caused by the enhancement introduced in
      PHKL_24294 to increase the protection ID on PA2.0
      architecture from 15 to 17 bits.
      The protection fault handler was not enhanced to
      understand this change, so when user processes that
      share a memory region match in the lower 15 bits of the
      protection ID, one of the user processes may loop
      recursively performing protection faults.

    - This behavior only occurs on PA2.0 architecture systems.
      It is most likely to occur on PA2.0 systems that execute
      a large number of user processes or threads, which can
      result in protection IDs using the 16th and 17th bits
      utilized on this architecture. The behavior has been
      reported on systems running the OpenMail application,
      on which the ual.remote, item.browse, and user.ice
      processes appear to hang.

    - To avoid this behavior, HP recommends removing PHKL_24294
      and PHKL_25093 from PA2.0 architecture systems running
      OpenMail, as well as PA2.0 architecture systems on which
      other user processes appear to hang. PHKL_24294 and
      PHKL_25093 should also be removed from software depots
      used to install patches on these systems.

    - The previous patches, PHKL_21779 and PHKL_21925, do not
      exhibit this same behavior. To ensure as many other
      known issues as possible are addressed, HP recommends
      that PHKL_21779 and PHKL_21925 be installed after
      PHKL_24294 and PHKL_25093 are removed. If PHKL_21779 and
      PHKL_21925 were installed prior to PHKL_24294 and
      PHKL_25093, they will automatically be restored when
      PHKL_24294 and PHKL_25093 are removed and they will not
      need to be re-installed.
    - The following files can be useful in determining which
      types of systems are PA2.0 architecture:
          /usr/lib/sched.models
          /usr/sam/lib/mo/sched.models
          /opt/langtools/lib/sched.models

Hardware Platforms - OS Releases:
    s700: 10.20

Products: N/A

Filesets:
    OS-Core.CORE-KRN

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical:
    Yes
    PHKL_24294: PANIC
    PHKL_21925: PANIC
    PHKL_20605: PANIC

Path Name: /hp-ux_patches/s700/10.X/PHKL_24294

Symptoms:
    PHKL_24294:
    (SR: 8606182898 CR: JAGad52114)
    panic: "hdl_alloc_spaceid" occurs on a PA2.0 system when an
    application does more than 30k anonymous mmaps.
    A typical panic stack trace may look like:
        panic+0x10
        hdl_alloc_id+0x1f0
        hdl_alloc_spaceid+0x24
        hdl_allocreg_spaceid+0x34
        choose_shared_mmap_space+0x50
        choose_space+0xcc
        hdl_attach+0x1a4
        attachreg+0x88
        smmap_common+0xa3c
        smmap+0x38
        syscall+0x75c
        $syscallrtn+0x0

    PHKL_21925:
    (SR: 8606136642 CR: JAGad05766)
    panic: "rmfree: overlap" when unmapping an mmap (shared mmf)
    segment.

    (SR: 8606146018 CR: JAGad15354)
    panic: "Data page fault" while trying to mmap the maximum
    number of allowed pregions to a shared region.

    PHKL_20605:
    ( SR: 8606105836 CR: JAGab74182 )
    Data page fault panic in hdl_choose_protid().
    Stack trace looks like:
        panic+0x10
        report_trap_or_int_and_panic+0xe8
        trap+0x1054
        $RDB_trap_patch+0x20
        hdl_choose_protid+0xe4
        hdl_changerange+0xe4
        hdl_mprotect+0x404
        choose_shared_mmap_space+0x2c8
        choose_space+0xcc
        hdl_attach+0x1a4
        attachreg+0x88
        smmap_common+0x27c
        smmap+0x38
        syscall+0x1a4
        $syscallrtn+0x0

    ( SR: 8606110048 CR: JAGab82751 )
    Data page fault panic on multiprocessor system.
    Stack trace might look like:
        panic+0x10
        report_trap_or_int_and_panic+0xe8
        trap+0xa48
        $RDB_trap_patch+0x20
        hdl_range_same+0x68
        hdl_changerange+0x90
        hdl_mprotect+0x404
        choose_shared_mmap_space+0x268
        choose_space+0x9c
        hdl_attach+0x1a4
        attachreg+0x88
        smmap_common+0x728
        smmap+0x38
        syscall+0x1a4
    or
        panic+0x10
        report_trap_or_int_and_panic+0xe8
        trap+0xa48
        $call_trap+0x20
        hdl_range_same+0x68
        hdl_changerange+0x90
        hdl_mprotect+0x404
        do_shared_munmap+0xe8
        do_munmap+0x14c
        foreach_pregion+0xb8
        munmap+0x64
        syscall+0x1a4

    PHKL_19383:
    mmap64(2) returns an error when used to map portions of a
    file beyond the 2 GB file offset.

    PHKL_16880:
    It is possible for "munmap" to unexpectedly release the lock
    that is obtained from "fcntl". The application may behave
    unexpectedly as a result. This patch corrects that behavior.

Defect Description:
    PHKL_24294:
    (SR: 8606182898 CR: JAGad52114)
    There are two issues with this defect. First, we are
    allocating a spaceid for anonymous mmaps when it is not
    required. The second issue is the protection ID, which is
    required for anonymous mmaps. The current implementation is
    based on PA1.1 hardware, which limits the size of the protid
    map to 32768. PA1.1 allows 16 bits of protection ID, so
    allowing one bit for write disable leaves 15 bits, or 2^15,
    which is 32768. However, the PA2.0 architecture allows 18
    bits of protection ID, so allowing for the write disable bit
    gives 2^17, or 131072. We should be taking advantage of the
    additional bits the PA2.0 architecture allows.

    Resolution:
    First, do not allocate a spaceid for anonymous mmaps.
    Second, dynamically size the protid map based on a runtime
    check of the CPU architecture. For PA1.1, we are limited to
    a size of 32768 as described above, so there will be no
    functional change for machines running PA1.1 chips. For
    machines running PA2.0 chips, the protid map will be
    dynamically increased to 131072, the maximum allowed by the
    hardware.
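    The protid map sizes above, and the low-15-bit match
    described in the 02/02/08 Warning, can be checked with a few
    lines of arithmetic. The following Python sketch is
    illustrative only: the function and variable names are
    hypothetical and are not HP-UX kernel symbols, and the
    15-bit mask is a simplified model of the comparison the
    Warning describes, not the actual fault handler code.

```python
# Illustrative arithmetic only. All names are hypothetical and
# do not correspond to HP-UX kernel symbols.

PA11_PROTID_BITS = 16    # protection ID width on PA1.1 (from the text)
PA20_PROTID_BITS = 18    # protection ID width on PA2.0 (from the text)
WRITE_DISABLE_BITS = 1   # one bit is reserved for write disable


def protid_map_size(protid_bits):
    """Number of usable protection IDs for a given ID width."""
    return 1 << (protid_bits - WRITE_DISABLE_BITS)


def low15(protid):
    """Keep only the lower 15 bits (assumed model of a handler
    that still treats IDs as PA1.1-sized)."""
    return protid & ((1 << 15) - 1)


pa11_size = protid_map_size(PA11_PROTID_BITS)
pa20_size = protid_map_size(PA20_PROTID_BITS)

# Two distinct IDs that differ only in the 16th bit look
# identical once truncated to 15 bits.
protid_a = 0x0123
protid_b = protid_a | (1 << 15)
ids_differ = protid_a != protid_b
low_bits_match = low15(protid_a) == low15(protid_b)

print(pa11_size, pa20_size, ids_differ, low_bits_match)
```

    Under these assumptions the sizes come out as 32768 and
    131072, matching the figures in the description, and the two
    sample IDs differ while agreeing in their lower 15 bits.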
    PHKL_21925:
    (SR: 8606146018 CR: JAGad15354)
    The defect is due to a race condition in the mmap(2) code,
    which used a recursive algorithm to map all of the file. A
    lock was dropped and reacquired each time the algorithm
    recursed. During an attach operation, the recursive routine
    hdl_mmf_attach() dropped the reglock() before the attach was
    complete, so a process doing an attach operation could race
    with another process doing a detach/munmap operation. The
    fix is not to drop the reglock before the attach operation
    is done, which removes the potential race with a
    detach/munmap operation.

    Resolution:
    This patch fixes the recursive hdl_mmf_attach() problem that
    caused mmap/munmap MP race conditions.

    (SR: 8606146018 CR: JAGad15354)
    The system panics while trying to mmap() more than the
    maximum allowed number of pregions to a shared region (the
    limit is set by r_refcnt, which is of type ushort). If a
    program mmaps more than this limit, r_refcnt overflows and
    resets, which causes the system to panic. The fix is to
    check for the overflow and return ENOMEM.

    Resolution:
    This patch ensures that we check for the maximum limit of
    pregions attached to a region and, if that limit is reached,
    return ENOMEM.

    PHKL_20605:
    ( SR: 8606105836 CR: JAGab74182 )
    When removing a pregion from a region's pregion list, the
    appropriate pregion structure field was not always cleared.

    Resolution:
    This patch ensures that the pregion structure is properly
    updated when we remove it from a region's pregion list.

    ( SR: 8606110048 CR: JAGab82751 )
    A multiprocessor race condition between mmap/munmap resulted
    in attempting to access a subpregion before it had been
    initialized.

    Resolution:
    This patch ensures we initialize the subpregion before
    attempting to access it.

    PHKL_19383:
    mmap64(2) previously only supported file offsets up to 2 GB.
    Resolution:
    The mmap64(2) system call has been enhanced to support file
    offsets up to 4 GB.

    PHKL_16880:
    To correct the problem, the association between a "memory
    mapped file" and the system-wide file table reference count
    is removed. (That is, the reference count is no longer
    stored in the system-wide file table; however, the vnode
    reference counter is still updated by "mmap" and "munmap".)

SR:
    1653270546 1653277004 8606105836 8606110048 8606136642
    8606146018 8606182898

Patch Files:
    /usr/conf/lib/libhp-ux.a(hdl_init.o)
    /usr/conf/lib/libhp-ux.a(hdl_policy.o)
    /usr/conf/lib/libhp-ux.a(vm_mmap.o)
    /usr/conf/lib/libhp-ux.a(vm_pregion.o)

what(1) Output:
    /usr/conf/lib/libhp-ux.a(hdl_init.o):
        hdl_init.c $Date: 2001/06/01 06:13:21 $ $Revision: 1.9.98.6 $ PATCH_10.20 (PHKL_24294)
    /usr/conf/lib/libhp-ux.a(hdl_policy.o):
        hdl_policy.c $Date: 2001/06/01 06:27:15 $ $Revision: 1.15.98.18 $ PATCH_10.20 (PHKL_24294)
    /usr/conf/lib/libhp-ux.a(vm_mmap.o):
        vm_mmap.c $Date: 2000/07/06 17:39:35 $ $Revision: 1.17.98.22 $ PATCH_10.20 (PHKL_21925)
    /usr/conf/lib/libhp-ux.a(vm_pregion.o):
        vm_pregion.c $Date: 2000/06/29 11:25:43 $ $Revision: 1.16.98.16 $ PATCH_10.20 (PHKL_21925)

cksum(1) Output:
    132000780 6484 /usr/conf/lib/libhp-ux.a(hdl_init.o)
    4204642415 10740 /usr/conf/lib/libhp-ux.a(hdl_policy.o)
    4002093950 22464 /usr/conf/lib/libhp-ux.a(vm_mmap.o)
    1687754910 12004 /usr/conf/lib/libhp-ux.a(vm_pregion.o)

Patch Conflicts: None

Patch Dependencies:
    s700: 10.20: PHKL_16750

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_16880 PHKL_19383 PHKL_20605 PHKL_21925

Equivalent Patches:
    PHKL_24295:
    s800: 10.20

Patch Package Size: 120 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Log in as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

           cd /tmp
           sh PHKL_24294

    5a. For a standalone system, run swinstall to install the
        patch:

           swinstall -x autoreboot=true -x match_target=true \
                     -s /tmp/PHKL_24294.depot

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHKL_24294. If you do not wish to retain a
    copy of the original software, you can create an empty file
    named /var/adm/sw/patch/PATCH_NOSAVE.

    WARNING: If this file exists when a patch is installed, the
             patch cannot be deinstalled. Please be careful
             when using this feature.

    It is recommended that you move the PHKL_24294.text file to
    /var/adm/sw/patch for future reference.

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

           dd if=/tmp/PHKL_24294.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    This patch depends on base patch PHKL_16750.
    For successful installation, please ensure that PHKL_16750
    is in the same depot with this patch, or PHKL_16750 is
    already installed.