Patch Name: PHNE_10957

Patch Description: s700 10.01 NFS Kernel Cumulative Megapatch

Creation Date: 97/05/15

Post Date: 97/05/23

Warning: 97/06/18 - This Critical Warning has been issued by HP.

   - This patch introduces a race condition that can result in a
     process hang when performing normal IO across an NFS mount. The
     problem has been experienced by several customers, and systems
     with high-speed networking interfaces (for example, FDDI and
     100Base-T) appear most likely to experience it.

   - Patch PHNE_11370 is being released today to fix the problem. It
     is recommended that PHNE_10957 be removed and PHNE_11370 be
     installed to prevent this problem from occurring.

Hardware Platforms - OS Releases:
   s700: 10.01

Products: N/A

Filesets: OS-Core.CORE-KRN

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical: Yes
   PHNE_10957: CORRUPTION PANIC
      Overwritten rnode error in do_bio()
      Rename of jfs file system from PCNFS causes panic
      Data page fault in ckuwakeup()
   PHNE_9357: PANIC
      Data Page Fault in binvalfree_vfs()
   PHKL_8308: PANIC
      Data Page Fault in svc_getreqset()
   PHKL_6428: OTHER
      GETATTR calls to the NFS server, even though valid entries are
      in the cache, cause poor performance
   PHKL_6132: PANIC
      Data Page fault in hash_export()
   PHKL_6000: PANIC
      Data Page fault in hash_export()

Path Name: /hp-ux_patches/s700/10.X/PHNE_10957

Symptoms:
   PHNE_10957:
      1. Some instances of NFS writes (such as cp from an NFS client)
         may complete successfully even when errors occur.
      2. Renaming a VxFS file to another VxFS file system from a
         PCNFS client causes a panic.
      3. Disabling anonymous access is not recognized by PCNFS
         clients, allowing them to run as a privileged user.
      4. Data page faults occur on the client.
      5. Directories in an NFS-mounted file system are created with a
         000 permissions value.
   PHNE_9357:
      1. A data page fault from binvalfree_vfs occurs in an MP
         environment.
      2. The ability to switch O_SYNC behavior to O_DSYNC as
         described in PHKL_5971 was missing in PHKL_8308.
      3. System hangs occur on large systems.
      4. The length of a timeout for an NFS request may become
         extremely long (on the order of minutes).
      5. The system does not initialize the vnode attributes when it
         sees a file size which is too large for the 10.20 file
         system, and returns an error. The server makes the directory
         anyway, with uninitialized (000) attributes.
   PHKL_9019:
      Client behavior is incorrect with attribute caching turned on.
   PHKL_8308:
      1. The previous NFS Megapatch caused another data page fault in
         an MP environment due to synchronization problems with the
         client's biods.
      2. A panic within kernel RPC (in svc_getreqset) in an MP
         environment is generated due to another synchronization
         problem.
      3. A process hang within the nfsd daemons may occur when NFS
         traffic is particularly heavy.
   PHKL_7983:
      The previous NFS Megapatch causes a data page fault due to
      synchronization problems with the client's rnode data.
   PHKL_7510:
      1. When systems which support large UIDs (10.20 and up) are
         clients of or servers to systems supporting a smaller
         maximum UID, several types of symptoms may occur:
         - logins on NFS clients may receive incorrect access on NFS
           servers
         - files from NFS servers may appear to be owned by the wrong
           logins on NFS clients
         - setuid and setgid binaries available on NFS servers may
           allow client logins to run with incorrect access
      2. Performance for MP clients on larger n-way systems may be
         less than desirable.
      3. The nettl.TRCx file may grow excessively large when using
         JFS (VxFS) across NFS.
      4. A previous patch, PHKL_7142, caused a data page fault panic.
         This problem could also manifest itself as a spinlock
         timeout (but is less likely to do so).
   PHKL_7142:
      Poor client performance.
   PHKL_6428:
      1. Unmounting an NFS file system temporarily hangs the client.
      2. Performance problems due to an "invalid" NFS attribute cache
         causing too many attribute lookups.
   PHKL_6132:
      Process panic when cr_free attempts to free unused memory.
   PHKL_6000:
      1. panic: data page fault
      2. trap type 15
      3. occurred in hash_export() in the panic stack
   PHKL_5971:
      In 9.X releases, the performance and semantics of O_SYNC were
      the same as the 10.0X O_DSYNC. Unfortunately, this was a
      violation of the POSIX standard, which was fixed for the 10.0X
      releases. The consequence of the change was to make O_SYNC more
      robust but also slower than in 9.0X.

Defect Description:
   PHNE_10957:
      1. Error codes kept in the rnode for an NFS client's file may
         get overwritten, and therefore not reported back to the
         caller when the file is closed.
      2. The NFS server renaming procedures do not check for
         differing VxFS file systems when asking for a rename, which
         will cause a panic down in VxFS.
      3. The server authorization program does not properly check for
         anonymous access when user IDs of -2 are used.
      4. The netisr callout function did not protect against a race
         condition.
      5. The sattr_to_vattr routine does not initialize attribute
         values when a file size error is encountered. The calling
         routines should handle the error, and not perform the action
         (making a directory, or making a symbolic link).
   PHNE_9357:
      1. The binvalfree_vfs algorithm did not recheck the status of a
         buffer cache pointer after acquiring the spinlock meant to
         protect the cache entry, allowing a race condition window
         between the initial check and the actual spinlock.
      2. The variable and code used to switch O_SYNC behavior to
         O_DSYNC were not compiled into PHKL_8308.
      3. Incorrect usage of the dnlc purge functions.
      4. The maximum timeout values defined in RPC were very long,
         and neither RPC nor NFS values matched those of SUN.
   PHKL_9019:
      The NFS client code was not clearing an EOF flag where needed.
   PHKL_8308:
      1. The kernel's biod support code did not sufficiently protect
         against MP race conditions.
      2. The RPC processor affinity implementation used by the nfsds
         was not sufficiently protected against MP race conditions.
      3. Two nfsds can block on each other, waiting for resources
         that the other owns.
   PHKL_7983:
      The unprotected rnode data allowed tests to be made on stale
      data, while the action of the test was based on new data,
      causing the wrong file size calculations.
   PHKL_7510:
      1. A future HP-UX release will increase the value of MAXUID,
         providing for a greater range of valid UIDs and GIDs. It
         will also introduce problems in mixed-mode NFS environments.
         Let "LUID" specify a machine running a version of HP-UX with
         large-UID capability. Let "SUID" specify a machine with
         current small-UID capability. The following problems may
         occur:

         LUID client, SUID server
         - A previous patch (PHKL_5079) makes client logins with UIDs
           outside the server's range appear as the anonymous user.
           However, the anonymous user UID is configurable, and is
           sometimes configured as the root user (in order to "trust"
           all client root logins without large-scale modifications
           to the /etc/exports file). Thus, all logins with large
           UIDs on the client could be mapped to root on the server.
         - If this previous patch has not been applied, files created
           by logins with large UIDs on the client will have the
           wrong UID on the server. This could be exploited by
           particular UIDs to gain root access on the server.
         - Files owned by the nobody user on the server will appear
           to be owned by the wrong user on the client.

         SUID client, LUID server
         - Files owned by large-UID logins on the server will appear
           to be owned by the wrong user on the client.
         - Executables with the setuid or setgid mode turned on will
           allow logins on the client to run as the wrong users.
      2. MP clients use the file system semaphore within NFS, which
         is not an efficient synchronization technique.
      3. VxFS attempts to do a Block read (BREAD), which is not
         supported in 10.01. The default read is then done, but a
         warning is sent to the log files.
      4. MP client synchronization was incomplete in PHKL_7142.
   PHKL_7142:
      Poor client performance due to synchronization with the global
      (file system) semaphore.
   PHKL_6428:
      1. The algorithm for flushing buffer caches is inefficient,
         forcing multiple walks of the buffer cache. Large system
         memory forces large buffer caches, with the result being
         very slow cache flushes.
      2. The attribute cache contents may seem invalid if the
         credential pointers differ, even though the actual
         credentials are the same. This would indicate an invalid
         attribute cache entry even though it was valid, forcing an
         unnecessary attribute lookup.
   PHKL_6132:
      The credential structure used in the kernel's NFS request
      dispatcher (rfs_dispatch) was not properly allocated, and
      could, in some cases, be unavailable when the request completes
      and the credential structure is released with cr_free().
      (INDaa21734)
   PHKL_6000:
      hash_export() used some of the fields in a file handle for
      hashing, and since NFS doesn't have any protection against
      getting a bogus file handle, a panic situation occurred. One
      way to get this panic is to send a bogus file handle to nfsd
      right after it is loaded during boot.
   PHKL_5971:
      This patch provides the 9.0X O_SYNC behavior on a 10.0X
      system. On a 10.0X system, this is called O_DSYNC.

SR:
   5003352534 5003344226 5003343277 5003340042 5003330894 5003327338
   5003326090 5003322370 5003321513 5003319665 5003319145 5003279927
   5003279091 4701341669 4701314302 4701306837 4701306829 1653197632
   1653192294 1653150599 1653146886 1653146308 1653134924 1653101337

Patch Files:
   /usr/conf/lib/libnfs.a(clnt_kudp.o)
   /usr/conf/lib/libnfs.a(nfs_export.o)
   /usr/conf/lib/libnfs.a(nfs_server.o)
   /usr/conf/lib/libnfs.a(nfs_subr.o)
   /usr/conf/lib/libnfs.a(nfs_vnops.o)
   /usr/conf/lib/libnfs.a(svc.o)

what(1) Output:
   /usr/conf/lib/libnfs.a(clnt_kudp.o):
      clnt_kudp.c $Date: 97/05/07 16:29:37 $ $Revision: 1.7.101.10 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_export.o):
      nfs_export.c $Date: 97/05/06 18:04:39 $ $Revision: 1.1.101.11 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_server.o):
      nfs_server.c $Date: 97/05/06 17:56:54 $ $Revision: 1.1.101.19 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_subr.o):
      nfs_subr.c $Date: 97/05/06 18:02:26 $ $Revision: 1.1.101.15 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_vnops.o):
      nfs_vnops.c $Date: 97/05/06 17:55:51 $ $Revision: 1.1.101.15 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(svc.o):
      svc.c $Date: 97/05/07 16:29:27 $ $Revision: 1.6.101.7 $
      PATCH_10.01 PHNE_10957

cksum(1) Output:
   1976203435 11360 /usr/conf/lib/libnfs.a(clnt_kudp.o)
   3715581565 9784 /usr/conf/lib/libnfs.a(nfs_export.o)
   2740347324 28452 /usr/conf/lib/libnfs.a(nfs_server.o)
   1873322213 21936 /usr/conf/lib/libnfs.a(nfs_subr.o)
   988708430 34068 /usr/conf/lib/libnfs.a(nfs_vnops.o)
   874162841 5736 /usr/conf/lib/libnfs.a(svc.o)

Patch Conflicts: None

Patch Dependencies:
   s700: 10.01: PHKL_8841 PHKL_10136

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
   PHKL_5971 PHKL_6000 PHKL_6132 PHKL_6428 PHKL_7142 PHKL_7510
   PHKL_7983 PHKL_8308 PHKL_9019 PHNE_9357

Equivalent Patches:
   PHKL_10958: s800: 10.01

Patch Package Size: 180 Kbytes

Installation Instructions:
   Please review all instructions and the Hewlett-Packard SupportLine
   User Guide or your Hewlett-Packard support terms and conditions
   for precautions, scope of license, restrictions, and limitation of
   liability and warranties, before installing this patch.
   ------------------------------------------------------------
   1. Back up your system before installing a patch.
   2. Login as root.
   3. Copy the patch to the /tmp directory.
   4. Move to the /tmp directory and unshar the patch:

         cd /tmp
         sh PHNE_10957

   5a. For a standalone system, run swinstall to install the patch:

         swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHNE_10957.depot

   5b.
For a homogeneous NFS Diskless cluster, run swcluster on the
       server to install the patch on the server and the clients:

         swcluster -i -b

       This will invoke swcluster in interactive mode and force all
       clients to be shut down.

       WARNING: All cluster clients must be shut down prior to the
       patch installation. Installing the patch while the clients are
       booted is unsupported and can lead to serious problems.

       The swcluster command will invoke an swinstall session in
       which you must specify:

          alternate root path - default is /export/shared_root/OS_700
          source depot path   - /tmp/PHNE_10957.depot

       To complete the installation, select the patch by choosing
       "Actions -> Match What Target Has" and then
       "Actions -> Install" from the Menubar.

   5c. For a heterogeneous NFS Diskless cluster:
       - run swinstall on the server as in step 5a to install the
         patch on the cluster server.
       - run swcluster on the server as in step 5b to install the
         patch on the cluster clients.

   By default swinstall will archive the original software in
   /var/adm/sw/patch/PHNE_10957. If you do not wish to retain a copy
   of the original software, you can create an empty file named
   /var/adm/sw/patch/PATCH_NOSAVE.

   Warning: If this file exists when a patch is installed, the patch
   cannot be deinstalled. Please be careful when using this feature.

   It is recommended that you move the PHNE_10957.text file to
   /var/adm/sw/patch for future reference.

   To put this patch on a magnetic tape and install from the tape
   drive, use the command:

      dd if=/tmp/PHNE_10957.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
   If the patch dependencies are not satisfied, the kernel cannot be
   rebuilt once this patch is applied.

   The following is a useful process for applying more than one patch
   while only requiring a single reboot after the final patch
   installation:

   1) Copy the individual patch depots into /tmp.
   2) Make a new directory to contain the set of patches:

         mkdir /tmp/DEPOT     # For example

   3) For each patch "PHNE_10957":

         swcopy -s /tmp/PHNE_10957.depot \* @ /tmp/DEPOT

   4) swinstall -x match_target=true -x autoreboot=true \
         -s /tmp/DEPOT

   To make the O_SYNC option behave the same as the O_DSYNC option,
   that is, to have asynchronous inode updates for NFS writes, you
   NEED TO MANUALLY modify the global variable o_sync_is_o_dsync
   using adb as follows:

      echo "o_sync_is_o_dsync?W 1" | \
         adb -k -w /stand/vmunix /dev/mem
      echo "o_sync_is_o_dsync/W 1" | \
         adb -k -w /stand/vmunix /dev/mem

   You can do this after the system is up and running without any
   problems. You can also go back to the 10.0X behavior by
   re-executing the command lines using a 0 instead of a 1. If you
   rebuild or reboot the kernel, you will have to reapply the adb
   change, since the default value is 0.
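   The two adb pipelines above (one patching the on-disk kernel
   image, one patching the running kernel) can be generated by a
   small helper so they can be reviewed before being run as root.
   This is a sketch only, not part of the patch; the function name
   emit_o_sync_toggle is hypothetical, and the function prints the
   commands rather than executing adb itself.

```shell
# Hypothetical helper (not part of the patch): print the adb(1)
# pipelines that would set o_sync_is_o_dsync to 0 or 1.
# Dry run only; it never executes adb itself.
emit_o_sync_toggle() {
    val="$1"
    case "$val" in
        0|1) ;;
        *) echo "usage: emit_o_sync_toggle 0|1" >&2; return 1 ;;
    esac
    # ?W patches the on-disk kernel image; /W patches the live kernel.
    printf 'echo "o_sync_is_o_dsync?W %s" | adb -k -w /stand/vmunix /dev/mem\n' "$val"
    printf 'echo "o_sync_is_o_dsync/W %s" | adb -k -w /stand/vmunix /dev/mem\n' "$val"
}
```

   On a patched HP-UX system, "emit_o_sync_toggle 1 | sh" (as root)
   would apply the change, and "emit_o_sync_toggle 0 | sh" would
   restore the default 10.0X behavior.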
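   Steps 1) through 4) of the multi-patch procedure above can be
   collected into one helper. The sketch below is hypothetical (the
   name build_patch_commands is not an HP tool) and is a dry run: it
   only prints the swcopy and swinstall commands, assuming each patch
   depot has already been unsharred into /tmp.

```shell
# Hypothetical dry-run helper for the single-reboot multi-patch
# procedure. First argument is the common depot directory; the
# remaining arguments are patch names whose depots sit in /tmp.
# Prints the commands; it does not execute them.
build_patch_commands() {
    depot="$1"; shift
    printf 'mkdir -p %s\n' "$depot"
    for patch in "$@"; do
        printf 'swcopy -s /tmp/%s.depot \\* @ %s\n' "$patch" "$depot"
    done
    printf 'swinstall -x match_target=true -x autoreboot=true -s %s\n' "$depot"
}
```

   For example, "build_patch_commands /tmp/DEPOT PHNE_10957
   PHKL_8841" prints one swcopy line per patch followed by the single
   swinstall; once reviewed, the output can be piped to sh.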
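   The cksum(1) output listed earlier can also be checked
   mechanically after installation. The helper below is a
   hypothetical sketch (verify_cksums is not an HP tool): it reads
   expected "checksum size path" lines on stdin and compares them
   with the files on disk. Note that the listing above names archive
   members such as libnfs.a(clnt_kudp.o); those must first be
   extracted with "ar x" before they can be checksummed as plain
   files.

```shell
# Hypothetical helper: compare files against expected cksum(1)
# output. Stdin carries lines of "<checksum> <size> <path>"; each
# file is reported as OK, BAD, or MISSING, and the function returns
# nonzero if any file fails.
verify_cksums() {
    status=0
    while read -r sum size path; do
        [ -z "$path" ] && continue
        if actual=$(cksum "$path" 2>/dev/null); then
            # cksum prints "<checksum> <size> <path>"; split it.
            set -- $actual
            if [ "$1" = "$sum" ] && [ "$2" = "$size" ]; then
                echo "OK $path"
            else
                echo "BAD $path (got $1 $2, expected $sum $size)"
                status=1
            fi
        else
            echo "MISSING $path"
            status=1
        fi
    done
    return $status
}
```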