Patch Name: PHNE_10957

Patch Description: s700 10.01 NFS Kernel Cumulative Megapatch

Creation Date: 97/05/15

Post Date: 97/05/23

Warning: 97/06/18 - This Critical Warning has been issued by HP.

   - This patch introduces a race condition that can result in a
     process hang when performing normal IO across an NFS mount. The
     problem has been experienced by several customers, and systems
     with high-speed networking interfaces (for example, FDDI and
     100Base-T) appear most likely to experience it.

   - Patch PHNE_11370 is being released today to fix the problem. It
     is recommended that PHNE_10957 be removed and PHNE_11370 be
     installed to prevent this problem from occurring.

Hardware Platforms - OS Releases:
   s700: 10.01

Products: N/A

Filesets: OS-Core.CORE-KRN

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical: Yes
   PHNE_10957: CORRUPTION PANIC
      Overwritten rnode error in do_bio()
      Rename of jfs file system from PCNFS causes panic
      Data page fault in ckuwakeup()
   PHNE_9357: PANIC
      Data Page Fault in binvalfree_vfs()
   PHKL_8308: PANIC
      Data Page Fault in svc_getreqset()
   PHKL_6428: OTHER
      GETATTR calls to the NFS server, even though valid entries are
      in the cache, cause poor performance
   PHKL_6132: PANIC
      Data Page fault in hash_export()
   PHKL_6000: PANIC
      Data Page fault in hash_export()

Path Name: /hp-ux_patches/s700/10.X/PHNE_10957

Symptoms:
   PHNE_10957:
      1. Some instances of NFS writes (such as cp from an NFS client)
         may complete successfully even when errors occur.
      2. Renaming a VxFS file to another VxFS file system from a
         PCNFS client causes a panic.
      3. Disabling anonymous access is not recognized by PCNFS
         clients, allowing them to run as a privileged user.
      4. Data page faults occur on the client.
      5. Directories in an NFS-mounted file system are created with a
         000 permissions value.
   PHNE_9357:
      1. A data page fault from binvalfree_vfs occurs in an MP
         environment.
      2. The ability to switch O_SYNC behavior to O_DSYNC as
         described in PHKL_5971 was missing in PHKL_8308.
      3. System hangs occur on large systems.
      4. The length of a timeout for an NFS request may become
         extremely long (on the order of minutes).
      5. The system does not initialize the vnode attributes when it
         sees a file size which is too large for the 10.20 file
         system, and returns an error. The server makes the directory
         anyway, with uninitialized (000) attributes.
   PHKL_9019:
      Client behavior is incorrect with attribute caching turned on.
   PHKL_8308:
      1. The previous NFS Megapatch caused another data page fault in
         an MP environment due to synchronization problems with the
         client's biods.
      2. A panic within kernel RPC (in svc_getreqset) in an MP
         environment is generated due to another synchronization
         problem.
      3. A process hang within the nfsd daemons may occur when NFS
         traffic is particularly heavy.
   PHKL_7983:
      The previous NFS Megapatch causes a data page fault due to
      synchronization problems with the client's rnode data.
   PHKL_7510:
      1. When systems which support large UIDs (10.20 and up) are
         clients of or servers to systems supporting a smaller
         maximum UID, several types of symptoms may occur:
         - logins on NFS clients may receive incorrect access on NFS
           servers
         - files from NFS servers may appear to be owned by the wrong
           logins on NFS clients
         - setuid and setgid binaries available on NFS servers may
           allow client logins to run with incorrect access
      2. Performance for MP clients on larger n-way systems may be
         less than desirable.
      3. The nettl.TRCx file may grow excessively large when using
         JFS (VxFS) across NFS.
      4. A previous patch, PHKL_7142, caused a data page fault panic.
         This problem could also manifest itself as a spinlock
         timeout (but is less likely to do so).
   PHKL_7142:
      Poor client performance.
   PHKL_6428:
      1. Unmounting an NFS file system temporarily hangs the client.
      2. Performance problems due to an "invalid" NFS attribute cache
         causing too many attribute lookups.
   PHKL_6132:
      Process panic when cr_free attempts to free unused memory.
   PHKL_6000:
      1. panic: data page fault
      2. trap type 15
      3. occurred in hash_export() in the panic stack
   PHKL_5971:
      In 9.X releases, the performance and semantics of O_SYNC were
      the same as the 10.0X O_DSYNC. Unfortunately, this was a
      violation of the POSIX standard, which was fixed for the 10.0X
      releases. The consequence of the change was to make O_SYNC more
      robust but also slower than in 9.0X.

Defect Description:
   PHNE_10957:
      1. Error codes kept in the rnode for an NFS client's file may
         get overwritten, and therefore not reported back to the
         caller when the file is closed.
      2. The NFS server renaming procedures do not check for
         differing VxFS file systems when asking for a rename, which
         will cause a panic down in VxFS.
      3. The server authorization program does not properly check for
         anonymous access when user IDs of -2 are used.
      4. The netisr callout function did not protect against a race
         condition.
      5. The sattr_to_vattr routine does not initialize attribute
         values when a file size error is encountered. The calling
         routines should handle the error, and not perform the action
         (making a directory, or making a symbolic link).
   PHNE_9357:
      1. The binvalfree_vfs algorithm did not recheck the status of a
         buffer cache pointer after acquiring the spinlock meant to
         protect the cache entry, allowing a race condition window
         between the initial check and the actual spinlock.
      2. The variable and code used to switch O_SYNC behavior to
         O_DSYNC were not compiled into PHKL_8308.
      3. Incorrect usage of the dnlc purge functions.
      4. The maximum timeout values defined in RPC were very long,
         and neither RPC nor NFS values matched those of SUN.
   PHKL_9019:
      The NFS client code was not clearing an EOF flag where needed.
   PHKL_8308:
      1. The kernel's biod support code did not sufficiently protect
         against MP race conditions.
      2. The RPC processor affinity implementation used by the nfsds
         was not sufficiently protected against MP race conditions.
      3. Two nfsds can block on each other, waiting for resources
         that the other owns.
   PHKL_7983:
      The unprotected rnode data allowed tests to be made on stale
      data, while the action of the test was based on new data,
      causing the wrong file size calculations.
   PHKL_7510:
      1. A future HP-UX release will increase the value of MAXUID,
         providing for a greater range of valid UIDs and GIDs. It
         will also introduce problems in mixed-mode NFS environments.
         Let "LUID" specify a machine running a version of HP-UX with
         large-UID capability. Let "SUID" specify a machine with
         current small-UID capability. The following problems may
         occur:

         LUID client, SUID server
         - A previous patch (PHKL_5079) makes client logins with UIDs
           outside the server's range appear as the anonymous user.
           However, the anonymous user UID is configurable, and is
           sometimes configured as the root user (in order to "trust"
           all client root logins without large-scale modifications
           to the /etc/exports file). Thus, all logins with large
           UIDs on the client could be mapped to root on the server.
         - If this previous patch has not been applied, files created
           by logins with large UIDs on the client will have the
           wrong UID on the server. This could be exploited by
           particular UIDs to gain root access on the server.
         - Files owned by the nobody user on the server will appear
           to be owned by the wrong user on the client.

         SUID client, LUID server
         - Files owned by large-UID logins on the server will appear
           to be owned by the wrong user on the client.
         - Executables with the setuid or setgid mode turned on will
           allow logins on the client to run as the wrong users.
      2. MP clients use the file system semaphore within NFS, which
         is not an efficient synchronization technique.
      3. VxFS attempts to do a Block read (BREAD), which is not
         supported in 10.01. The default read is then done, but a
         warning is sent to the log files.
      4. MP client synchronization was incomplete in PHKL_7142.
   PHKL_7142:
      Poor client performance due to synchronization with the global
      (file system) semaphore.
   PHKL_6428:
      1. The algorithm for flushing buffer caches is inefficient,
         forcing multiple walks of the buffer cache. Large system
         memory forces large buffer caches, with the result being
         very slow cache flushes.
      2. The attribute cache contents may seem invalid if the
         credential pointers differ, even though the actual
         credentials are the same. This would indicate an invalid
         attribute cache entry even though it was valid, forcing an
         unnecessary attribute lookup.
   PHKL_6132:
      The credential structure used in the kernel's NFS request
      dispatcher (rfs_dispatch) was not properly allocated, and
      could, in some cases, be unavailable when the request completes
      and the credential structure is released with cr_free().
      (INDaa21734)
   PHKL_6000:
      hash_export() used some of the fields in a file handle for
      hashing, and since NFS doesn't have any protection against
      getting a bogus file handle, a panic situation occurred. One
      way to get this panic is to send a bogus file handle to nfsd
      right after it is loaded during boot.
   PHKL_5971:
      This patch provides the 9.0X O_SYNC behavior on a 10.0X
      system. On a 10.0X system, this is called O_DSYNC.

SR:
   5003352534 5003344226 5003343277 5003340042 5003330894 5003327338
   5003326090 5003322370 5003321513 5003319665 5003319145 5003279927
   5003279091 4701341669 4701314302 4701306837 4701306829 1653197632
   1653192294 1653150599 1653146886 1653146308 1653134924 1653101337

Patch Files:
   /usr/conf/lib/libnfs.a(clnt_kudp.o)
   /usr/conf/lib/libnfs.a(nfs_export.o)
   /usr/conf/lib/libnfs.a(nfs_server.o)
   /usr/conf/lib/libnfs.a(nfs_subr.o)
   /usr/conf/lib/libnfs.a(nfs_vnops.o)
   /usr/conf/lib/libnfs.a(svc.o)

what(1) Output:
   /usr/conf/lib/libnfs.a(clnt_kudp.o):
      clnt_kudp.c $Date: 97/05/07 16:29:37 $ $Revision: 1.7.101.10 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_export.o):
      nfs_export.c $Date: 97/05/06 18:04:39 $ $Revision: 1.1.101.11 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_server.o):
      nfs_server.c $Date: 97/05/06 17:56:54 $ $Revision: 1.1.101.19 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_subr.o):
      nfs_subr.c $Date: 97/05/06 18:02:26 $ $Revision: 1.1.101.15 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(nfs_vnops.o):
      nfs_vnops.c $Date: 97/05/06 17:55:51 $ $Revision: 1.1.101.15 $
      PATCH_10.01 PHNE_10957
   /usr/conf/lib/libnfs.a(svc.o):
      svc.c $Date: 97/05/07 16:29:27 $ $Revision: 1.6.101.7 $
      PATCH_10.01 PHNE_10957

cksum(1) Output:
   1976203435 11360 /usr/conf/lib/libnfs.a(clnt_kudp.o)
   3715581565 9784 /usr/conf/lib/libnfs.a(nfs_export.o)
   2740347324 28452 /usr/conf/lib/libnfs.a(nfs_server.o)
   1873322213 21936 /usr/conf/lib/libnfs.a(nfs_subr.o)
   988708430 34068 /usr/conf/lib/libnfs.a(nfs_vnops.o)
   874162841 5736 /usr/conf/lib/libnfs.a(svc.o)

Patch Conflicts: None

Patch Dependencies:
   s700: 10.01: PHKL_8841 PHKL_10136

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
   PHKL_5971 PHKL_6000 PHKL_6132 PHKL_6428 PHKL_7142 PHKL_7510
   PHKL_7983 PHKL_8308 PHKL_9019 PHNE_9357

Equivalent Patches:
   PHKL_10958: s800: 10.01

Patch Package Size: 180 Kbytes

Installation Instructions:
   Please review all instructions and the Hewlett-Packard SupportLine
   User Guide or your Hewlett-Packard support terms and conditions
   for precautions, scope of license, restrictions, and limitation of
   liability and warranties, before installing this patch.
   ------------------------------------------------------------
   1. Back up your system before installing a patch.
   2. Login as root.
   3. Copy the patch to the /tmp directory.
   4. Move to the /tmp directory and unshar the patch:

         cd /tmp
         sh PHNE_10957

   5a. For a standalone system, run swinstall to install the patch:

         swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHNE_10957.depot

   5b.
For a homogeneous NFS Diskless cluster, run swcluster on the
       server to install the patch on the server and the clients:

         swcluster -i -b

       This will invoke swcluster in interactive mode and force all
       clients to be shut down.

       WARNING: All cluster clients must be shut down prior to the
       patch installation. Installing the patch while the clients are
       booted is unsupported and can lead to serious problems.

       The swcluster command will invoke an swinstall session in
       which you must specify:

          alternate root path - default is /export/shared_root/OS_700
          source depot path   - /tmp/PHNE_10957.depot

       To complete the installation, select the patch by choosing
       "Actions -> Match What Target Has" and then
       "Actions -> Install" from the Menubar.

   5c. For a heterogeneous NFS Diskless cluster:
       - run swinstall on the server as in step 5a to install the
         patch on the cluster server.
       - run swcluster on the server as in step 5b to install the
         patch on the cluster clients.

   By default swinstall will archive the original software in
   /var/adm/sw/patch/PHNE_10957. If you do not wish to retain a copy
   of the original software, you can create an empty file named
   /var/adm/sw/patch/PATCH_NOSAVE.

   Warning: If this file exists when a patch is installed, the patch
   cannot be deinstalled. Please be careful when using this feature.

   It is recommended that you move the PHNE_10957.text file to
   /var/adm/sw/patch for future reference.

   To put this patch on a magnetic tape and install from the tape
   drive, use the command:

      dd if=/tmp/PHNE_10957.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
   If the patch dependencies are not satisfied, the kernel cannot be
   rebuilt once this patch is applied.

   The following is a useful process for applying more than one patch
   while only requiring a single reboot after the final patch
   installation:

   1) Copy the individual patch depots into /tmp.
   2) Make a new directory to contain the set of patches:

         mkdir /tmp/DEPOT     # For example

   3) For each patch "PHNE_10957":

         swcopy -s /tmp/PHNE_10957.depot \* @ /tmp/DEPOT

   4) swinstall -x match_target=true -x autoreboot=true \
         -s /tmp/DEPOT

   To make the O_SYNC option behave the same as the O_DSYNC option,
   that is, to have asynchronous inode updates for NFS writes, you
   NEED TO MANUALLY modify the global variable o_sync_is_o_dsync
   using adb as follows:

      echo "o_sync_is_o_dsync?W 1" | \
         adb -k -w /stand/vmunix /dev/mem
      echo "o_sync_is_o_dsync/W 1" | \
         adb -k -w /stand/vmunix /dev/mem

   You can do this after the system is up and running without any
   problems. You can also go back to the 10.0X behavior by
   re-executing the command lines using a 0 instead of a 1. If you
   rebuild or reboot the kernel, you will have to reapply the adb
   change, since the default value is 0.
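   The two adb pipelines above (one patching the on-disk kernel
   image, one patching the running kernel) can be generated by a
   small helper so they can be reviewed before being run as root.
   This is a sketch only, not part of the patch; the function name
   emit_o_sync_toggle is hypothetical, and the function prints the
   commands rather than executing adb itself.

```shell
# Hypothetical helper (not part of the patch): print the adb(1)
# pipelines that would set o_sync_is_o_dsync to 0 or 1.
# Dry run only; it never executes adb itself.
emit_o_sync_toggle() {
    val="$1"
    case "$val" in
        0|1) ;;
        *) echo "usage: emit_o_sync_toggle 0|1" >&2; return 1 ;;
    esac
    # ?W patches the on-disk kernel image; /W patches the live kernel.
    printf 'echo "o_sync_is_o_dsync?W %s" | adb -k -w /stand/vmunix /dev/mem\n' "$val"
    printf 'echo "o_sync_is_o_dsync/W %s" | adb -k -w /stand/vmunix /dev/mem\n' "$val"
}
```

   On a patched HP-UX system, "emit_o_sync_toggle 1 | sh" (as root)
   would apply the change, and "emit_o_sync_toggle 0 | sh" would
   restore the default 10.0X behavior.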
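   Steps 1) through 4) of the multi-patch procedure above can be
   collected into one helper. The sketch below is hypothetical (the
   name build_patch_commands is not an HP tool) and is a dry run: it
   only prints the swcopy and swinstall commands, assuming each patch
   depot has already been unsharred into /tmp.

```shell
# Hypothetical dry-run helper for the single-reboot multi-patch
# procedure. First argument is the common depot directory; the
# remaining arguments are patch names whose depots sit in /tmp.
# Prints the commands; it does not execute them.
build_patch_commands() {
    depot="$1"; shift
    printf 'mkdir -p %s\n' "$depot"
    for patch in "$@"; do
        printf 'swcopy -s /tmp/%s.depot \\* @ %s\n' "$patch" "$depot"
    done
    printf 'swinstall -x match_target=true -x autoreboot=true -s %s\n' "$depot"
}
```

   For example, "build_patch_commands /tmp/DEPOT PHNE_10957
   PHKL_8841" prints one swcopy line per patch followed by the single
   swinstall; once reviewed, the output can be piped to sh.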
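   The cksum(1) output listed earlier can also be checked
   mechanically after installation. The helper below is a
   hypothetical sketch (verify_cksums is not an HP tool): it reads
   expected "checksum size path" lines on stdin and compares them
   with the files on disk. Note that the listing above names archive
   members such as libnfs.a(clnt_kudp.o); those must first be
   extracted with "ar x" before they can be checksummed as plain
   files.

```shell
# Hypothetical helper: compare files against expected cksum(1)
# output. Stdin carries lines of "<checksum> <size> <path>"; each
# file is reported as OK, BAD, or MISSING, and the function returns
# nonzero if any file fails.
verify_cksums() {
    status=0
    while read -r sum size path; do
        [ -z "$path" ] && continue
        if actual=$(cksum "$path" 2>/dev/null); then
            # cksum prints "<checksum> <size> <path>"; split it.
            set -- $actual
            if [ "$1" = "$sum" ] && [ "$2" = "$size" ]; then
                echo "OK $path"
            else
                echo "BAD $path (got $1 $2, expected $sum $size)"
                status=1
            fi
        else
            echo "MISSING $path"
            status=1
        fi
    done
    return $status
}
```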