Patch Name:  PHKL_23835

Patch Description:  s700 10.20 Disk sort algorithm; ncsize, vxfs tunables

Creation Date:  01/04/05

Post Date:  01/04/10

Hardware Platforms - OS Releases:
    s700: 10.20

Products:  N/A

Filesets:
    JournalFS.VXFS-BASE-KRN OS-Core.CORE-KRN OS-Core.KERN-RUN

Automatic Reboot?:  Yes

Status:  General Release

Critical:  Yes
    PHKL_23835: PANIC HANG
    PHKL_18499: CORRUPTION

Path Name:  /hp-ux_patches/s700/10.X/PHKL_23835

Symptoms:
    PHKL_23835:
    (SR: 8606184951 CR: JAGad54153)
    The system may panic or hang due to insufficient kernel virtual
    space. This is caused by the dynamic nature of the VxFS inode
    cache, which may consume large amounts of memory and then free
    it after use, causing heavy memory fragmentation.

    PHKL_22308:
    (SR: 8606157905 CR: JAGad27235)
    The system sometimes takes a very long time (up to several
    hundred seconds) to respond to a disk read/write request while
    it is busy processing other I/O requests on the same disk,
    especially when sequential file accesses are in progress.

    PHKL_18334:
    Slow performance on HFS file systems due to DNLC performance
    scalability issues. Decreasing the size of the DNLC improves
    performance.

    PHKL_18499:
    File system corruption could occur on ufs in some situations
    where reads and writes were made simultaneously to the same
    disk block.

Defect Description:
    PHKL_23835:
    (SR: 8606184951 CR: JAGad54153)
    VxFS inode caching causes memory fragmentation: occasionally
    (for example, during a backup) it caches a large number of
    inodes in memory for one process and then, after inode
    utilization dies down, frees them. This patch adds VxFS
    tunables that help to avoid the fragmentation.

    Resolution:
    Two tunables are added to the system:

        vx_ninode
        vx_noifree

    vx_ninode sets the maximum number of inodes that can be present
    in the VxFS in-memory inode cache. If it is set to zero (the
    default), the size is tuned according to how much physical
    memory the system has. Decreasing the maximum size of the inode
    cache reduces the fragmentation it causes.

    To see what size the inode cache is on a running system:

        # echo 'vxfs_ninode/D' | adb /stand/vmunix /dev/mem
        vxfs_ninode:
        vxfs_ninode:    8000

    So this system will have a maximum of 8000 inodes allocated in
    the VxFS cache.

    vx_noifree controls whether memory is freed from the VxFS inode
    cache. If it is set to zero (the default), unused inodes are
    eventually freed back to the general memory pool. If vx_noifree
    is non-zero, memory is never freed from the VxFS inode cache.

    It may seem counter-intuitive to hoard memory to prevent memory
    problems, but not returning the 1KB buckets that hold VxFS
    inodes to the general memory pool prevents fragmentation of the
    pool. Once the inode cache reaches its maximum size, VxFS will
    always re-use older inodes.

    PHKL_22308:
    (SR: 8606157905 CR: JAGad27235)
    This is a fairness problem with the disk sort algorithm. The
    disk sort algorithm is used to reduce disk head retractions.
    With this algorithm, if the queue is not empty, all I/O requests
    with the same priority are queued in non-descending order of
    disk block number before being processed. When requests come in
    faster than they can be processed, the queue grows, and the time
    needed to perform one scan (from the smallest block number to
    the largest block number of the disk) can be very long in the
    worst case. This is unfair to a request that came in early but
    has been continuously pushed back to the end of the queue
    because it has a large block number or because it just missed
    the current scan. Such unlucky requests can wait in the queue
    for as long as it takes to process a whole scan (which could be
    a few minutes). This situation usually arises when one process
    tries to access a disk while another process is performing
    sequential accesses to the same disk.

    Resolution:
    To prevent this problem from happening, we have to take the time
    aspect into consideration in the sorting algorithm.
    We add a time stamp to each request when it is enqueued, which is
    used as the second sorting key for the queue (1st key: process
    priority; 2nd key: enqueued time; 3rd key: block number). The
    granularity of the time stamp value is controlled by a new
    tunable, "disksort_seconds". If "disksort_seconds" is set to N
    (N>0), then among requests with the same priority, any given
    request is guaranteed to be processed earlier than those that
    come in N or more seconds after it. Within each N-second period
    (requests with the same time stamp), all requests are sorted in
    non-descending block number order. Choosing the right
    "disksort_seconds" value balances the maximum waiting time of
    requests against the efficiency of disk accesses. The tunable
    can be set to 0, 1, 2, 4, 8, 16, 32, 64, 128 or 256 second(s).
    If "disksort_seconds" is 0 (the default value), the time stamp
    is disabled, which means that the time aspect has no effect.

    PHKL_18334:
    The size of the DNLC may impact the performance of ufs file
    systems. A workaround that was tested successfully is to tune
    the DNLC to a smaller size in order to improve performance.

    Resolution:
    Added a kernel tunable, ncsize, which can be set to reduce the
    size of the DNLC.

    PHKL_18499:
    A ufs routine was incorrectly merging raw device requests into
    the I/O request chain of a different type (read vs. write) that
    involved the same disk block.

    Resolution:
    The ufs routine was fixed to no longer merge requests of
    different types.
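
    Illustration: the three-key ordering described for PHKL_22308
    above can be sketched as follows. This is a minimal Python
    sketch, not the kernel code in ufs_dsort.c. The names Request,
    time_stamp and order_queue are hypothetical, the time stamp is
    assumed to be the enqueue time divided by "disksort_seconds"
    and rounded down, a smaller priority value is assumed to be
    served first, and a full sort stands in for the kernel's
    incremental queue insertion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for a queued disk I/O request."""
    priority: int       # assumption: a smaller value is served first
    enqueued_at: float  # enqueue time in seconds
    block: int          # disk block number


def time_stamp(enqueued_at: float, disksort_seconds: int) -> int:
    """Coarse time stamp; a tunable of 0 disables the time key."""
    if disksort_seconds <= 0:
        return 0
    return int(enqueued_at // disksort_seconds)


def order_queue(requests, disksort_seconds):
    """Order requests by (priority, time stamp, block number)."""
    return sorted(
        requests,
        key=lambda r: (
            r.priority,
            time_stamp(r.enqueued_at, disksort_seconds),
            r.block,
        ),
    )


# One early request for a high block number competing with later
# requests for low block numbers, all at the same priority.
queue = [
    Request(priority=0, enqueued_at=0.5, block=900),
    Request(priority=0, enqueued_at=1.0, block=100),
    Request(priority=0, enqueued_at=5.0, block=50),
    Request(priority=0, enqueued_at=9.0, block=10),
]

blocks_unfair = [r.block for r in order_queue(queue, 0)]  # time key disabled
blocks_fair = [r.block for r in order_queue(queue, 4)]    # 4-second windows
```

    With the time key disabled, the early request for block 900 is
    served last (10, 50, 100, 900). With a 4-second window it is
    served within the first window (100, 900, 50, 10), at the cost
    of one extra backward head movement.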

SR:
    1653281659 4701414698 8606157905 8606184951

Patch Files:
    /usr/conf/lib/libhp-ux.a(ufs_dsort.o)
    /usr/conf/lib/libvxfs_base.a(vx_config.o)
    /usr/conf/master.d/fs-tune
    /usr/conf/master.d/vxfs
    /usr/conf/space.h.d/core-hpux.h
    /usr/conf/space.h.d/fs-tune.h

what(1) Output:
    /usr/conf/lib/libhp-ux.a(ufs_dsort.o):
        ufs_dsort.c $Date: 2000/09/29 07:23:40 $ $Revision: 1.20.98.14 $ PATCH_10.20 (PHKL_22308)
    /usr/conf/lib/libvxfs_base.a(vx_config.o):
        vx_config.c $Date: 2001/03/30 13:01:23 $ $Revision: 1.7.98.15 $ PATCH_10.20 (PHKL_23835)
    /usr/conf/master.d/fs-tune:
        fs-tune $Date: 2000/09/29 07:28:13 $ $Revision: 1.1.98.4 $ PATCH_10.20 (PHKL_22308)
    /usr/conf/master.d/vxfs:
        vxfs $Date: 2001/03/30 13:05:45 $ $Revision: 1.4.98.3 $ PATCH_10.20 (PHKL_23835)
    /usr/conf/space.h.d/core-hpux.h:
        core-hpux.h $Date: 99/04/21 06:32:29 $ $Revision: 1.6.98.16 $ PATCH_10.20 (PHKL_18334)
    /usr/conf/space.h.d/fs-tune.h:
        fs-tune.h: $Date: 2001/03/30 13:04:36 $ $Revision: 1.1.98.6 $ PATCH_10.20 (PHKL_23835)

cksum(1) Output:
    2012782560 9720 /usr/conf/lib/libhp-ux.a(ufs_dsort.o)
    3511858625 8084 /usr/conf/lib/libvxfs_base.a(vx_config.o)
    918853315 681 /usr/conf/master.d/fs-tune
    3553762909 4672 /usr/conf/master.d/vxfs
    3886057463 19203 /usr/conf/space.h.d/core-hpux.h
    2576875247 246 /usr/conf/space.h.d/fs-tune.h

Patch Conflicts:  None

Patch Dependencies:
    s700: 10.20: PHKL_16750

Hardware Dependencies:  None

Other Dependencies:  None

Supersedes:
    PHKL_18334 PHKL_18499 PHKL_22308

Equivalent Patches:
    PHKL_23836: s800: 10.20
    PHKL_18141: s700: 11.00

Patch Package Size:  110 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license, restrictions,
    and limitation of liability and warranties, before installing
    this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Login as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_23835

    5a. For a standalone system, run swinstall to install the
        patch:

        swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHKL_23835.depot

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHKL_23835. If you do not wish to retain a
    copy of the original software, you can create an empty file
    named /var/adm/sw/patch/PATCH_NOSAVE.

    WARNING: If this file exists when a patch is installed, the
             patch cannot be deinstalled. Please be careful when
             using this feature.

    It is recommended that you move the PHKL_23835.text file to
    /var/adm/sw/patch for future reference.

    To put this patch on a magnetic tape and install from the tape
    drive, use the command:

        dd if=/tmp/PHKL_23835.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    This patch depends on base patch PHKL_16750. For successful
    installation, please ensure that PHKL_16750 is in the same
    depot as this patch, or that PHKL_16750 is already installed.