mounted machine down => df hangs

Greg Wageman greg at sj.ate.slb.com
Tue Nov 7 10:21:55 AEST 1989


In article <2652 at brazos.Rice.edu> rush at xanadu.llnl.gov (Alan Edwards) writes:
>X-Sun-Spots-Digest: Volume 8, Issue 180, message 14 of 15
>
>When one of our disk servers goes down, doing a 'df' on a machine that has
>one of the disk server's partitions mounted causes the 'df' process
>to hang PERMANENTLY.  The df process cannot be killed by kill -9.  Is
>there anything I can do to prevent this?  The machine that hangs is
>running SunOS 3.5.  Will this be fixed when we upgrade to 4.0.3?

This isn't a bug, it's a feature.

A process that tries to access a hard-mounted filesystem whose server is
down will block in the kernel NFS code.  There it will remain until the
NFS code can complete the access.  Once the server comes back up, the
operation completes without any indication of error to the process in
question; this is the reason for a hard mount.

You cannot "kill" such a process, as it is not running.  The signal is
queued for the process and won't be delivered until it unblocks, which
means when the server comes back up.  At that point it will resume running
anyway, and the operation will complete normally.
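
As a rough analogy only, the "queued but not delivered" state can be seen
from user space with a signal mask.  This is not the kernel's mechanism:
a masked signal merely stands in for a process asleep in the NFS code, and
SIGKILL itself cannot be masked, so the sketch uses SIGUSR1.  The function
name is made up for illustration.

```python
import signal
import threading

def pending_then_delivered():
    """Send ourselves SIGUSR1 while delivery is held off, then release it.

    Returns (held, delivered): whether the signal sat pending without the
    handler running, and whether the handler ran once delivery was allowed.
    """
    delivered = []
    old = signal.signal(signal.SIGUSR1,
                        lambda signum, frame: delivered.append(signum))
    try:
        # Hold off delivery; stands in for "blocked until the server returns".
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
        signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
        held = signal.SIGUSR1 in signal.sigpending() and not delivered
        # "Server is back": the queued signal is delivered now.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
        return held, bool(delivered)
    finally:
        signal.signal(signal.SIGUSR1, old)
```

Called from the main thread of a Unix process, the signal stays pending
for as long as delivery is held off and the handler runs only afterwards.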

On the other hand, a soft-mounted filesystem will only block the process
until the timeout and retry counts are exhausted, at which point an error
is printed on the console and the file access returns an error to the
program.
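
The difference between the two mount types can be sketched as a retry
loop.  This is a toy model, not the real client code: ServerDown,
make_server and nfs_access are hypothetical names, "retrans" is simply
named after the usual mount option, and the choice of ETIMEDOUT as the
error handed back is an assumption.

```python
import errno

class ServerDown(Exception):
    """Hypothetical stand-in for one request that timed out with no reply."""

def make_server(down_for):
    """Return a fake request that times out `down_for` times, then answers."""
    state = {"calls": 0}
    def request():
        state["calls"] += 1
        if state["calls"] <= down_for:
            raise ServerDown()
        return "ok"
    return request

def nfs_access(request, hard, retrans=3):
    """Toy model of the client retry loop for one file access.

    hard=True  : retry until the server answers; the caller never sees an error.
    hard=False : allow `retrans` retries, then return an error to the caller.
    """
    failures = 0
    while True:
        try:
            return request()
        except ServerDown:
            failures += 1
            if not hard and failures > retrans:
                # Soft mount: retries exhausted, so the program gets an error.
                raise OSError(errno.ETIMEDOUT, "NFS server not responding")
            # Hard mount (or retries remaining): try again.
```

With a server that stays down for ten requests, nfs_access(make_server(10),
hard=True) keeps retrying and eventually returns the answer, while the same
call with hard=False raises OSError once the retries run out.  Against a
server that never comes back, the hard case loops forever, which is the
hang 'df' shows.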

Since the purpose of NFS is to make remote filesystems appear
indistinguishable from local ones, and since a down server is not
considered a "normal condition", "df" is doing exactly what one would
expect.  Sorry.

Copyright 1989 Greg Wageman	DOMAIN: greg at sj.ate.slb.com
Schlumberger Technologies	UUCP:   {uunet,decwrl,amdahl}!sjsca4!greg
San Jose, CA 95110-1397		BIX: gwage  CIS: 74016,352  GEnie: G.WAGEMAN
        Permission granted for not-for-profit reproduction only.


